ABSTRACT

In response to a provision of the Education
Amendments of 1978 concerning evaluation practices and procedures,
this report examines four aspects of evaluation in education which
focus on how funds allocated to evaluation can be spent more
effectively and yield more useful results. On this basis, groups of
recommendations are made to both Congress and the Department of
Education. These are presented in an opening summary and then
discussed more fully in each chapter. The first chapter is designed
as an introduction to the background and scope of the report. A
definition of evaluation is given in chapter two, which also
addresses congressional concern with uniform methods and measures in
the context of delineating different types of evaluation and their
appropriate use. Improvement of the quality of evaluation forms the
substance of chapter three. Discussion in chapter four centers upon
the utilization and dissemination of evaluation results. The final
chapter makes recommendations for improved management and
organization and presents implications for this derived from
preceding chapters. Appended are a glossary, references, and three
appendixes concerning types of federal educational evaluation
activities, the parties who conduct them, and the organization of the
evaluation system at state and local levels. (Author/AEP)

Preface



The Education Amendments of 1978 (P.L. 95-561), which
reauthorized the major federal elementary and secondary
school programs, included the following provision:

STUDY OF EVALUATION PRACTICES AND PROCEDURES

SEC. 1526. The Commissioner of Education
shall conduct a study of evaluation practices
and procedures at the national, State, and
local levels with respect to federally funded
elementary and secondary educational programs
and shall include in the first annual report to
Congress submitted more than one year after the
date of enactment of this Act proposals and
recommendations for the revision or
modification of any part or all of such
practices and procedures. Such proposals and
recommendations shall include provisions

(1) to ensure that evaluations are based
on uniform methods and measurements;

(2) to ensure the integrity and
independence of the evaluation process;
and 

(3) to ensure appropriate follow-up on 
the evaluations that are conducted. 

This requirement has provided the impetus for the
present report. In response to the legislative request,
the National Academy of Sciences was asked by the Office
of Education (OE) to undertake a study of program
evaluation in education. The purpose of the study was to
recommend ways of increasing the effectiveness and
[...]
started late in 1979 and completed under the auspices of
the new Department of Education, the successor agency to
OE.

It was explicit in the request made by OE that the
core of the study would be a report by an expert
committee. The Committee on Program Evaluation in
Education came to life in early 1980, convened under the
auspices of the Assembly of Behavioral and Social
Sciences. Its membership was selected to represent
appropriate disciplines as well as different viewpoints
and responsibilities regarding evaluation, in recognition
of the fact that the problems to be addressed related as
much to the organization, management, and policy uses of
evaluation as to questions of evaluation strategy,
methodology, and quality. The disciplines represented on
the Committee included communications, economics,
educational administration, educational psychology,
experimental psychology, political science, social
psychology, sociology, sociology of education, and
statistics/psychometrics. The experience represented
included: carrying out large-scale and smaller
evaluations in different settings (university, local
school system, private sector); commissioning evaluations
and managing more general programs of support for applied
social research and development (R&D) within several
government agencies; serving as staff to a major
congressional education committee; and carrying out
pertinent research on methodology and utilization of
evaluations and on social R&D. Several members had also
conducted general assessments of the field of evaluation.

The Committee held three two-day meetings and a longer
working conference to develop the substance of the
report. Richard A. Berk of the University of California,
Santa Barbara, assisted the Committee as a consultant
during the working conference. During its first two
meetings, the Committee focused on defining the key
issues to be addressed. Senior staff from the Department
of Education and from education committees in Congress
met with the Committee to give us the benefit of their
views. (See Appendix D for a list of participants.) In
addition to the concerns expressed by Congress with
methods, integrity, and follow-up, Department officials
asked that the following organizational topics be
addressed: location of evaluation activities within the
Department; coordination of evaluation within the
Department; participation in evaluation design and use by
program and planning officials; and continuing advisory
mechanisms for evaluation. Department staff also raised
the following nonorganizational issues: distinguishing
among types of evaluations; planning of evaluations;
strategic considerations in evaluation management; and
appropriate utilization.

Starting from those expressed concerns, the Committee
explored other related issues and came to organize the
report around four major topic areas: distinguishing
between evaluation types and choosing appropriate
strategies and procedures; improving the quality of
evaluations; increasing the effective use of evaluations;
and improving the organization and management of
federally funded evaluations in education. The
congressional concern with uniform methods and measures
was subsumed under the broader topic of evaluation
strategies and procedures, since consideration of methods
and measures is possible only in the context of a
specific set of policy questions and after an evaluation
strategy and procedure have been determined.

In carrying out its study, the Committee relied on
various kinds of information to supplement the members'
knowledge and experience. Members and staff conducted
informal interviews with employees and ex-employees of
OE, of the Department of Education, of other federal R&D
support agencies, and with congressional staff familiar
with the provision calling for the assessment of
evaluation practices. (For a list of persons
interviewed, see Appendix D.) Two papers were
commissioned from consultants to supply detailed
information on the evaluation activities within the
Department and on the performer communities that carry
out evaluation studies; they appear as Appendixes A and
B. A third paper, contributed by Committee member Freda
M. Holley, provided insight into evaluation activities at
the state and local levels and is included as Appendix
C. Working papers were also prepared by
Richard A. Berk and by members Marvin C. Alkin, Robert F.
Boruch, and Robert K. Yin. These have been published by
their individual authors under the aegis of the Center
for the Study of Evaluation (Baker 1980). Material from
these papers and from various drafts of chapter sections
prepared by other Committee members has been incorporated
in the report. Additional background material available
to the Committee included agency planning documents,
annual reports, and internal critiques relating to
evaluation activities and their application to decisions
about programs.

This report is not a comprehensive examination of
program evaluation in education. The intent of the
sponsoring agency was to have a group of experts apply
their knowledge and experience to the problems identified
by Congress and the Department. This has structured both
the selection of subject matter and the nature of the
evidentiary base, which is drawn largely from existing
data and analyses. Neither money nor time was available
for an empirical study, such as: an examination of the
quality of procurement instruments, of resulting
proposals, or of evaluation reports; systematic surveys
of sponsors or performers on their experience with
different types of evaluations and management practices;
or primary analysis of the use of evaluation results.
However, the Committee was able to use the findings of a
second and more extensive project funded by OE in
response to a congressional request. This project,
located at Northwestern University, included collection
of empirical data and examination of the literature on
evaluation of federally supported education programs at
the national, state, and local levels. During its third
meeting, the Committee reviewed the reports of this
project and became familiar with its findings (Boruch and
Cordray 1980). In addition, the director of the
Northwestern project served on the Committee, which was
thus able to take advantage of the complementary nature
of the two projects.

The Committee is grateful for the assistance it
received from many other sources. We owe special thanks
to John W. Evans, the former head of the central
evaluation unit of the Department of Education, who made
himself and his staff fully available to the Committee,
and to Marshall Smith, former executive assistant to the
Secretary. They and other staff within the Department of
Education provided much data and were generous with their
time and the effort needed to comply with our requests
for material and information. Staff members from the
National Science Foundation and from Congress also gave
generously of their time.

Members of the Assembly of Behavioral and Social
Sciences (ABASS) of the National Research Council and of
the Report Review Committee of the National Academy of
Sciences provided thoughtful comments on an earlier draft
[...] Eugenia Grohman, [...] for reports of [...]

Finally, we wish to thank Rose S. Kaufman, whose
administrative support early on facilitated the
organization and first meetings of our Committee, and
Diane L. Goldman, who ably took over from her as our
administrative secretary, typed the many versions of the
report, and provided us with much needed logistical
support and technical assistance.

Peter H. Rossi, Chair

Committee on Program Evaluation in Education

Summary



Evaluation as an established field of applied social
science research has grown rapidly over the last 20
years, accompanied by the expectation that the empirical
knowledge resulting from evaluation studies would improve
the process of making decisions about social programs.
In education, more than $40 million is now spent per year
for evaluation activities by the Department of Education;
about $60 million more in federal funds is spent by other
federal agencies and by state and local agencies. But as
the number of evaluation studies and their sophistication
have grown, so has concern that evaluation work has not
lived up to its potential. In response to such concerns
on the part of Congress, the Committee on Program
Evaluation in Education examined four aspects of
evaluation in education: the varieties of evaluation and
their respective roles; the quality of evaluation
efforts; the use of evaluation results; and the
organization and management of evaluation activities. We
focused on these topics because they were identified to
be of greatest interest to the two primary audiences for
our report: members of Congress and their staffs and
high-level officials in the Department of Education.



Two major findings permeate the Committee's report.
First, evaluation must be viewed as a system that
involves many organizations and many parties. Attempts
to improve the quality of evaluation studies or to
increase the use of evaluation results must deal with
systemic problems rather than with the specific
shortcomings of any individual evaluation. Therefore,
much of this report deals with such systemic issues as
the role of evaluation, the context in which it takes
place, and the diverse interests of the many groups
concerned with federal education programs. Second, both
the quality and the use of evaluations could be
considerably enhanced through better management
procedures. At present, the processes for soliciting and
funding studies constrain creativity; quality controls
are insufficient; limited review procedures at all stages
inhibit the development of an active intellectual
marketplace, the most effective arbiter of quality and
use. Hence, most of our recommendations are designed to
improve the procedures that now govern federally funded
evaluations in education. Improvement in management
procedures is the single most important step that
Congress and the Department could take if they wish to
achieve better quality in evaluations and to increase the
likelihood that evaluation results will be used
appropriately.



The Role of Evaluation 

To understand what evaluation can contribute to the
making of policy, one must understand its limited role in
affecting decisions that are largely shaped by other
forces. In any political decision, many parties with
diverse interests are likely to have a stake, and
evaluators are often asked to respond to several
audiences and competing constituencies. Even though
evaluations are frequently conducted at the behest of
governmental authorities making decisions about programs,
other audiences will respond to evaluation information as
well and use or not use it as it furthers their
objectives. Different audiences have need for different
types of information; different policy issues require
different types of studies. Unless the policy questions
to be addressed are clear to those who ask for
evaluations and to those who carry them out, the
perception that much evaluation work is irrelevant to the
policy process is likely to persist.

The diversity of research activities all going under
the general name of evaluation has led to considerable
misunderstanding. The diversity has come about because
it has become evident that studying the effectiveness of
operating programs (the traditional focus of
evaluation) does not answer some important questions;
research is also needed in planning and implementing
programs. During the planning phase, there are questions
of need and how to meet those needs. Survey and
ethnographic studies can establish the extent and
distribution of an educational problem; controlled pilot
testing and field tests can determine the effectiveness
and feasibility of alternative interventions for
relieving the problem; and economic analyses can be used
to make cost estimates. Once a program is established
and operating, there are questions of fiscal and coverage
accountability. Analyses of administrative records can
determine whether funds are being used properly and
whether the program is reaching the intended
beneficiaries, although supplementary fiscal audits and
beneficiary studies are sometimes required. Finding out
whether the program is being implemented appropriately
requires, in addition to program administrative records,
special surveys of program services and ethnographic
studies. Finally, there are questions of program impact;
they can be addressed definitively only through rigorous
and often costly research methods. Consequently, impact
evaluation should be undertaken only if the requisite
skills and resources are available.

Not all programs can be fully evaluated; that is, not
all questions can be answered for all programs. In
particular, meaningful impact evaluation is possible only
for programs for which intended beneficiaries and effects
can be clearly specified. There are two kinds of
programs for which such specification is extremely
difficult or impossible. For a program having vague
goals or many diverse goals, evaluators and those who
commission an evaluation must be able to agree on which
goal should be assessed and whether appropriate measures
are available to assess it. For a program in which local
sites are given autonomy to develop their own specific
objectives and means of reaching them, one cannot
evaluate for national impact by aggregating effects over
many diverse sites (though the effectiveness of
individual local projects may be evaluated). General
judgments about a national program become possible over
time, however, as knowledge from studies of individual
sites accumulates.

In an effort to increase the quality of information
furnished through local evaluations, Congress has sought
to encourage uniformity of methods and measurements in
evaluation. At this time, the Committee does not
consider such uniformity an appropriate means for
controlling quality, since requiring uniformity may
prematurely inhibit further advances in methodology.
Instead, evaluation methods should be subjected to the
full test of the intellectual marketplace through
intensive review and critique.



Improving the Quality of Evaluations 

The few systematic or informal surveys of evaluation
studies in education give some credence to the frequently
voiced dissatisfaction with the general level of their
quality. There appear to be several reasons that the
quality of evaluations in education has been found
wanting. First, the unrealistic expectation that
complicated evaluation issues can be addressed by a wide
variety of agencies has led to some inappropriate
assignments of evaluation responsibility. For example,
only a few large and sophisticated school systems and a
handful of states have the capacity to carry out rigorous
studies of program impact. In addition, the objectivity
that is necessary for good evaluation is sometimes
compromised at the state and local levels because much of
the evaluation funding, though supplied by the federal
government, is controlled by local program managers or
state administrators. Evaluation requirements imposed on
local and state authorities should match their
capabilities, and fiscal and organizational arrangements
should foster the integrity of local and state studies.

A second reason for the low quality of evaluations
arises from the way in which federal evaluation
activities in education are managed. Though the amount
of money spent on evaluation represents only about 0.5
percent of the total federal support for education, it is
a major source of income for private-sector research
firms; moreover, evaluation work is heavily concentrated
among the larger of those firms. This concentration has
come about because of the current procedures for
sponsoring and carrying out evaluations. Procurement
documents are highly complex and often include detailed
specifications on the various technical aspects of
evaluation. Internal planning procedures and design of
requests for proposals (RFPs) take so long that little
time is left for response. Universities, minority firms,
and small businesses, unlike large firms, are unable or
unwilling to compete under such conditions. The lack of
diversity among evaluation contractors reduces the
possibility of new ideas entering the evaluation system
and thereby improving it. Perspectives of beneficiary
populations, in particular, are underrepresented on both
the sponsor and the performer sides.

Flexibility in evaluation, which could contribute to 
quality, has also been reduced because of emphasis in the
past on large studies. The restrictions on creativity
imposed by this approach are aggravated when a single
individual or small group within the Department develops
the main procurement instruments, as is usually the case.
An additional constraint on flexibility and creativity is 
the current monitoring process, which makes it difficult 
to adjust the course of a study because of changed field 
conditions or because a different research direction is 
warranted. 

A third explanation for problems of quality is that 
the intellectual marketplace for appraisal and scrutiny 
of evaluations has yet to be fully formed. Generally,
there is no review by outside experts during the
procurement phase when the main elements of a study are
being designed; the lack of diversity among competitors
for evaluation work further inhibits opportunities for
the marketplace to operate; and, upon completion of a
study, external review of final reports happens only
sporadically. Institutional mechanisms for encouraging
ample discussion by experts and parties at interest of
plans for and findings of major studies are spotty at the
federal level; they are largely absent at the state and
local levels.



Using the Results of Evaluation 

A frequently voiced criticism of evaluation is that 
evaluation findings are seldom used. Implicit in this 
criticism is the notion that utilization means direct and 
often immediate changes in policy and program. In fact, 
there are several different types of utilization, not all 
immediately apparent. Moreover, the dissemination of
findings does not automatically lead to utilization, nor
is utilization synonymous with change.

Evaluation findings may be used for making specific 
changes at a given time, as commonly envisaged in 
discussions of utilization. Findings may also be used to 
confirm that changes are not needed. But information may
also be considered and not used because it is
inappropriate or because indicated directions for
policy are infeasible. Moreover, even when there is no
immediately discernible use of knowledge derived from
evaluations, it cumulates over time and is slowly
absorbed, eventually leading to changes in concepts and
decision perspectives.

There are important limits to the use of evaluation 
results in the short run. Social problem solving is and 
ought to be a political process; the forces and events 
impinging on decisions about programs are often more 
powerful than empirically derived evidence. The 
environment in which decisions are made seldom permits 
swift and unilateral action; new information may actually 
slow down the process, since it may make decisions more 
complicated. For these reasons, while evaluators and 
sponsors should do their best to disseminate evaluation 
findings, they cannot ensure utilization. 

Dissemination can be improved in a number of ways, 
however. At the very least, evaluation results must be 
communicated to the primary audience. Copies of reports
must be available; primary data should be accessible for
reanalysis. Unfortunately, none of these minimal
dissemination steps is now routine. Assuming that
information is made available, other important factors
affecting its use include whether it is perceived to be
objective and whether it is structured and reported in a
way that is relevant to potential users. Timeliness is
also important, particularly when direct application to
specific decisions is intended.

Because evaluation results are more likely to be used
when they address issues of importance to specific
audiences, concern with the use of evaluation findings
cannot begin when final reports are ready to be
disseminated. The primary audience of a given evaluation
and its information needs should be identified at the
inception of the study. Such initial identification will
help define the type of evaluation to be undertaken, the
issues to be addressed, the sort of information to be
collected, and the form of reporting and communication
that is likely to be most effective. The language of
evaluation reports is often a barrier to use: reports
must be intelligible to the intended audience and
should be augmented by more informal means of
communication, including person-to-person interpretation
of results. Linking mechanisms that mediate between
researcher and audience can facilitate the spread of
knowledge and the utilization process.



Organizing and Managing Evaluation Activities

The Department of Education has accountability and
oversight responsibilities with regard to federal
education programs and must carry out evaluation
activities that address those responsibilities. The
Department should also develop knowledge about programs
that can be used to improve both their management and
their contribution to more effective education. Finally,
the Department should be able to formulate new programs
based on tested alternatives that speak to unmet needs in
education.

At present, evaluation responsibilities are assigned
to several different units within the Department, and to
state and local agencies. Fiscal audits and
investigations on compliance with civil rights laws are
appropriately carried out by offices created specifically
for these functions. Similarly, local and state agencies
are appropriately responsible for supplying fiscal and
beneficiary information needed to administer federal
programs. However, the assignment of other types of
evaluation responsibilities among levels of government
and within the Department varies remarkably from program
to program, despite the existence of a central evaluation
unit.

Though some decentralization of activities is 
appropriate, assignment of responsibilities should be on 
a more systematic and purposeful basis. The Committee 
suggests the following guidelines: 

• Collection of information on beneficiaries served 
and on allocation of resources should continue to be a 
requirement for state and local agencies. When agencies 
do not have adequate capability for accurate reporting, 
technical assistance ought to be provided. An important 
caveat is that reporting requirements should not generate 
more information than can be digested at the level 
(federal or state) receiving the reports. No requirement 
should be imposed on all state and local agencies that 
goes beyond the basic reporting needed for accountability 
functions, such as studies of program effects and
cost-effectiveness analyses. Such studies should be done
on a national sample basis or by selected local or state
agencies of proven competence and with sufficient
resources.

• The Inspector General should continue to have
responsibility for fiscal audits. Coverage of
beneficiaries and program delivery should be monitored by
the officials who administer programs at the federal
level, but the central evaluation unit should, from time
to time, run independent studies as checks. As its major
responsibilities, the central evaluation unit should, in
cooperation with the program units, carry out studies to
establish whether and how specific programs can be
evaluated, sponsor documentation of program process and
implementation, and support studies aimed at the
improvement of existing programs or the development of 
new ones. The research office of the Department should 
help administer grant programs for evaluation studies and 
support research on the methods and processes of 
evaluation. 

Decentralizing evaluation responsibilities to any 
degree creates the problem of how evaluation dollars can 
be used effectively when they are dispersed among three 
levels of government and among many of the Department's 
units. First, adequate reporting of evaluation 
activities and expenditures must be instituted at all 
levels and for all units. Second, the central evaluation 
unit should be responsible for the coordination of 
evaluation throughout the Department, particularly with
respect to planning and reporting procedures. The unit
should also provide technical assistance and review for
the design and procurement of individual studies done by
other units, and it should be responsible for a
systematic process of review of interim and final reports
by inside and outside experts. A special dissemination
branch within the central unit should help other offices
with dissemination of findings from evaluation studies.

The central evaluation unit will not be able to carry
out effectively the suggested evaluation and coordination
responsibilities as long as it is subsumed within the
management arm of the Department. The implicit message
of this organization is that only the management
perspective of evaluation is important. The Committee
believes that evaluation must address the substance of
policies and programs, not only their management.
Therefore, administrative arrangements should be changed
so as to give top decision makers within the Department
more direct access to the central evaluation unit.



RECOMMENDATIONS 



The Committee has two sets of recommendations, one for
Congress and one for the Department. The recommendations
are presented and the discussion of them summarized in
the following two sections; the chapter numbers in
parentheses indicate where the more detailed discussions
are found.



Recommendations to Congress 

The first recommendation to Congress is concerned with 
obtaining a better match between the information that 
results from evaluation studies and the information that 
is useful in making decisions about programs. The next 
three recommendations, C-2, C-3, and C-4, are intended to 
improve oversight and accountability for evaluations
carried out with funds from federal education programs. 
The last recommendation to Congress addresses management 
constraints external to the Department. 



Recommendation C-1. When Congress requests evaluations,
it should identify the kind of question(s) to be
addressed. (Chapter 2)

Given the diversity of evaluation activities,
misunderstandings about what information is needed have
frequently arisen between Congress and the Department and
its evaluation contractors. Congress should attempt to
make more explicit whether it needs information about
program services, about program coverage, about program
impact, or about other program aspects. Such clarity
will make it more likely that useful information will be
delivered as a result of an evaluation effort. The
primary audience(s) for the results of the requested
evaluations should also be identified, since different
audiences need different types of information.

Clarity of congressional intent can be brought about 
in two ways. When specificity about questions and 
audiences is not possible ahead of time, evaluation staff 
within the Department need to engage in a continuing
dialogue with members of Congress and their staffs to
refine the policy issues to be addressed. Alternatively,
legislative language can specify such issues when
Congress wants specific information. Legislative
language regarding evaluation should refrain, however,
from specifying details of research method (such as 
sampling procedure or use of control groups) or of 
measurement. The choice of methods depends in part on 
specific evaluation conditions and contexts and should be 
done by technical experts only after careful 
consideration of all facets of an evaluation. 



Recommendation C-2. Congress should separate funding for
evaluations conducted at the state and local levels from
program and administrative funds. (Chapter 3)

Under present circumstances, the amount of money
invested and the kind of evaluation done at the state and
local levels is, in too many instances, controlled by
those who administer and run programs. This puts the
quality and integrity of state and local evaluation
activities in jeopardy. Moreover, the current
arrangement makes it impossible to know how much of the
federal funds potentially available for evaluation are
actually used for that purpose. Congress may also wish
to consider a percentage set-aside for evaluation of
programs at the state and local levels, as is now
legislated for a number of programs at the national level. 

Recommendation C-3. Congress should institute a 
diversified strategy of evaluation at the state and local 
levels that would impose minimum monitoring and 
compliance requirements on all agencies receiving federal 
funds but allow only the most competent to carry out
complex evaluation tasks. (Chapter 3)

All state and local agencies receiving federal funds
for education programs should be required to provide an
accounting of the distribution of funds and of
beneficiary coverage for each program. When specific
services and procedures are mandated, these should also
be subject to reports to ensure compliance. The Congress
should require the Department to institute appropriate
quality control procedures to raise the quality of state
and local data. Evaluation tasks that go beyond
accountability questions, however, should only be
required of state and local units on a highly selective
basis. Congress may wish to consider authorizing a 
competitive grants program, possibly administered through 
the National Institute of Education, for school systems
and states that would provide for funding a few of the 
most technically promising proposals for impact 
assessments of local programs or for program improvement 
based on evaluation of alternative program strategies. 

Recommendation C-4. Congress should require an annual 
report from the Department of Education on all evaluation 
expenditures and activities. (Chapter 3)

The annual evaluation report currently required from 
the Department should be expanded to cover all federally 
funded evaluation activities in education, including all 
of those in the Department as well as those carried out 
by state and local agencies. Expenditures at all levels 
should be specified; activities, findings, and their use 
should be briefly described. 



Recommendation C-5. Congress should authorize a study
group to analyze the combined effects of the legislative
provisions and executive regulations that control
federally funded applied research. (Chapter 5)

One of the causes of the lack of timeliness and 
relevance of evaluation studies is the accumulation of 
rules and regulations governing the whole process of
funding and carrying out applied research in the social 
service area. While almost every provision now on the 
books or enforced through executive practice is there to 
provide some safeguard and may be reasonable when 
considered in isolation, in the aggregate they have
negative effects. The trade-offs between the benefits of
the safeguards and the obstacles they create against
producing timely and relevant applied research at
reasonable cost deserve careful scrutiny; simplification
and reform may be in order.

Recommendations to the Department of Education 

The recommendations to the Department concentrate on 
management issues for two reasons. First, as noted, we 
believe that the quality of evaluations could be 
considerably improved and the use of evaluation findings
increased through better management procedures. Second,
the Department has the power to change many of its
current operating procedures, while it may be able to do
relatively little about such external constraints as the
development of the evaluation field, the size of its
budget, or agency personnel ceilings. The
recommendations on procedures are organized into those
intended to develop better strategies for overall
evaluation planning within the Department and for
planning individual studies; those intended to increase
the quality of evaluations, including three on training
and technical assistance; and those intended to
facilitate use. The last three recommendations speak to
improvements needed in general management procedures.



On Evaluation Strategy 

Recommendation D-1. In evaluations initiated by the 
Department of Education, the kinds of evaluation 
activities to be carried out should be specified clearly 
and should be justified in terms of program development 
or program implementation. (Chapter 2)

This recommendation is analogous to Recommendation C-1 
to Congress. It emphasizes the need to think through 
what type of evaluation activity is appropriate to any
given stage of planning or implementation of a proposed
program or an existing program. For example, top-level
Department officials need to specify what they wish to
know about a program, why they wish to know it at some
specified time, and what audiences other than themselves
have information needs that must be satisfied through
evaluation activities. All these needs must be
coordinated with legislated requests for evaluation.
(See also Recommendation D-10 on planning.)



Recommendation D-2. When pilot tests of proposed major 
programs are conducted, pilot tests of evaluation
requirements should be conducted simultaneously to
determine their feasibility and appropriateness.
(Chapter 2) 

While pilot tests of a program are being made, it is a 
relatively easy matter to pilot-test the proposed 
evaluation. Such a pilot test can be used to find out
what measurements can and cannot be made of program
benefits, how programs should account for and measure
costs, which testing instruments and procedures are
disruptive and which are not, how large a sample of
beneficiaries is needed to get valid program
measurements, and so forth. If a pilot test of an
evaluation were carried out in conjunction with the pilot
test of a program, the design of both the program and of
the evaluation requirements would be strengthened.

Recommendation D-3. The National Institute of Education
should continue and strengthen its program of support for
research in evaluation methods and processes. (Chapter 2)

The advances made in the technical aspects of 
evaluation have been considerable, but uneven. The 
Committee believes that too much attention has been given 
to investigating problems in the use of randomized
controlled experiments. Other important problems in
methodology have not received sufficient attention, for
example, methods for studying the delivery of services,
for investigating the properties of achievement tests
when used in the evaluation of programs, and for
assessing the impact of programs that cannot be studied
through the usual experimental paradigms. Another
neglected area of research is the process of evaluation
itself: how studies are commissioned and initiated, how
they are managed, what laws and procedures impinge upon
them. The Committee's work indicates that current
procedures constrain the quality and the use of
evaluations, but how these processes operate is poorly
understood; therefore, it is difficult to design
effective remedies. 

On Quality, Training, and Technical Assistance 

Recommendation D-4. The Department of Education should
provide funds for training programs in evaluation to
increase the skills of individuals currently charged with
carrying out or using evaluations and to increase the
participation of minorities. (Chapter 3)

The field of evaluation has grown more rapidly than 
the pool of skilled evaluators. As a consequence, there 
are many people working as evaluators whose training has 
been haphazard and inadvertent and who may not be fully 
familiar with more recent advances in techniques and
methods. Others may lack adequate knowledge of the
educational system or of the special needs of the groups
to be helped by federal education programs.

A primary training need concerns the
underrepresentation of minority group members in the
educational evaluation enterprise. Well over half of all
education programs target minority group persons as
recipients of services. The Committee believes that the
quality of evaluation would be improved by the employment
of minority persons who are also well trained
technically. For example, intimate personal knowledge of
the circumstances of beneficiaries will help to define
outcome measures that are more relevant to beneficiaries
and more closely related to improving the effectiveness
of programs. Hence, we believe that such perspectives
should be represented to the fullest extent possible in
the evaluation of such programs. Fellowships and
internship programs in evaluation that include specific
priorities for minority group persons would be doubly
valuable; they would produce good researchers and they
would enrich the evaluation system.

A second concern related to training is the
relationship between the evaluator and the administrator
or educator. The communication gap between the two that
inhibits the use of evaluation may be narrowed by
appropriate training on both sides. Executives and
program staff would benefit from greater knowledge of the
language of evaluation and how evaluation might be used;
evaluators need exposure to the problems, procedures, and
constraints of federal education programs. Evaluators
also need to improve interpersonal and communications
skills in order to convey evaluation information
effectively.

Technical training for evaluation staff is also
necessary, both within the federal government and at the
state and local levels. There have never been sufficient
numbers of staff trained in either rigorous evaluation
methods or in research, and there have been rapid
developments in the field. Evaluation is currently
practiced by those from almost every type of background
possible, including many with no more preparation than
that of classroom teaching. Practicing evaluators need
opportunities to upgrade and improve their skills. As
one way of meeting this need, the Department should
consider funding short-term institutes and conferences
providing up-to-date knowledge to the evaluation
community. (See also Recommendation D-17 on training
opportunities for federal staff.)



Recommendation D-5. The Department of Education should 
structure the procurement and funding procedures for 
evaluations so as to permit more creative evaluation work 
by opening up the process and allowing a period for
exploratory research. (Chapter 3) 

The more complex the evaluation, the less likely is it 
that one can spell out ahead of time the best methods for 
addressing the questions that the evaluation is designed 
to answer. The current RFP process in particular ignores
this fact. The Committee believes that the RFP process
can be made more flexible. RFPs for large studies should
include a period of exploratory research; they should
also provide for side studies that address questions
integral to the evaluation that emerge after it is under
way. Proposers should be given the freedom to specify
alternative methods and to suggest side studies. Most
important, sufficient time for developing proposals must
be allowed.

Mechanisms other than RFPs for funding evaluations can 
also be used to open up the system. For example,
unsolicited and solicited proposals, 8-A contracting,
cooperative agreements, basic ordering agreements, and
grant awards are each appropriate to given evaluation
tasks. The Committee's recommendation that a greater
variety of funding methods be employed does not imply
that the use of RFPs be drastically reduced. Flexibility
in the award process, we believe, will permit the
introduction of new ideas that may contribute to
higher-quality evaluations. Flexibility will also allow
greater participation by minority organizations and 
researchers. 



Recommendation D-6. All major national evaluations
should be reviewed by independent groups at the design,
award, and final report stages. Review groups should
include representatives of minorities and other consumers
as well as technical experts. The results of their
review should be made broadly available. (Chapter 3)

This recommendation also is intended to open up the 
process. There are three facets to it: improving the
technical quality of evaluations, assuring early
contribution and involvement from those most affected by
programs (beneficiary groups, teachers, etc.), and making
use of findings more likely through public exposure and 
understanding. 

When the RFP process is used, the agency itself should 
solicit as much outside advice as possible, through 
development of concept papers, planning conferences, and 
other pre-RFP activities. Proposal evaluation and
selection procedures should include experts from outside
the sponsoring agency. After award of a contract, the
contractor also should solicit the views of outsiders.
Then, when the project is done, outsiders should again
review the work, its assumptions, its technical
ambiguities, and its policy implications. Reviews of
completed work should be widely disseminated in order to
encourage discussions of the findings. The Department
might sponsor an annual conference on important 
evaluations that are at various points — design, 
completion of final report, reanalysis. If this were 
done, the educational community would know where to look 
for the latest evaluation results and criticisms and be 
apprised of impending work. 



Recommendation D-7. All statistical data generated by 
major evaluations should be made readily available for 
independent analysis after identifying information on 
individual respondents has been deleted. (Chapter 3)

When possible, ethnographic data and case study 
material, similarly treated to protect privacy and
confidentiality, should also be made available. 

Making primary data from evaluations available will 
require support in major evaluation contracts for 
documentation, storage, and dissemination of data and the 
creation of explicit agency policy on access to data. 
Since the objective is to generate adequate examination 
of the methods and findings of major evaluation studies, 
independent review and reanalysis should be supported by
the Department as part of its evaluation and research
programs.



Recommendation D-8. The Department of Education should
explore alternative approaches to technical assistance
for state and local evaluation needs. (Chapter 3)

The technical assistance needs of state and local 
agencies are not uniform. They vary with the size of the 
agency, the sophistication of the agency's evaluation
staff, and with the complexity of the federal program 
activity in the agency. The technical assistance centers 
associated with Title I are one approach to meeting such 
needs. Another approach would be to identify or fund 
exemplary models of monitoring and reporting and to 
disseminate the procedures involved. A third approach 
would be to develop the capability of state agencies to 
provide technical assistance to less sophisticated local
agencies. 

Technical assistance should also cover organizational 
and personnel issues. In particular, state and local
agencies need to be aware of the desirability of
separating an evaluation unit from program administration
in order to avoid conflicts of interest. Work already
done by some state and local agencies on optimal
institutional arrangements, personnel requirements, and
procurement policies for extramural work can form the
basis of advice and assistance to others. (See also
Recommendation D-16 on minimum requirements for 
monitoring and compliance reporting.) 



On Utilization 

Recommendation D-9. The Department of Education should 
test various mechanisms for providing linkage between
evaluators and potential users. (Chapter 4)

The Department should consider establishing a unit 
charged with studying, developing, and instituting
knowledge transfer mechanisms and evaluating their
effectiveness. Alternatively, outside experts might be
charged with this responsibility. Appropriate activities
would include assessing proposed dissemination plans,
performing needed translations of evaluation reports,
funding research on the communication and use of
evaluation information, and developing procedures
designed to improve the day-to-day use of evaluation 
data, at least within the Department. 



Recommendation D-10. The Department of Education should
institute a flexible planning system for evaluations of
federal education programs. (Chapter 4)

A workable planning system must provide for 
appropriate information to be available for recurring 
legislative decision cycles on education programs; it 
must accommodate an ongoing program of evaluation studies 
addressing problems that are poorly understood; and it
must be sufficiently flexible to allow response to
interesting but unanticipated questions that arise as a
result of ongoing research, changes in policy, or
development of new programs. The evaluation plan for any
major education program should contain a series of linked
studies, some of which furnish factual information in
reasonably short time and some of which address issues of
long-term interest.

Although planning does not necessarily lead to an 
agenda that is subsequently carried out in detail, 
planning almost always leads to an improved sense of 
priorities, provides a forum in which competing interests
can reach accommodations, and induces an active as
opposed to a reactive stance toward essential activities. 



Recommendation D-11. The Department of Education should 
establish a quick-response capability to address critical 
but unanticipated evaluation questions. (Chapter 4)

In order to be fully responsive to the information 
needs of its primary audiences, the Department must be 
able to combine a deliberative planning process that 
allows time for field and constituency involvement with a 
quick-response capability that can address unanticipated 
but critical evaluation questions as they arise. 
Department staff charged with evaluation responsibilities 
should be able to respond within 2-6 months to 
evaluation-related questions to which Congress or
top-level Department officials seek prompt answers. 
Several extramural mechanisms are available for this 
purpose, for example, maintaining lists of prequalified 
contractors who can be given specific task orders on 
short notice or using 8-A contracts and awards to 
SBA-eligible firms. 



Recommendation D-12. The Department of Education should
ensure that evaluations deal with topics that are
relevant to the likely users. (Chapter 4)

In order to increase the relevance of evaluation
results, primary audience(s) must be specified prior to
the beginning of a study. When conditions change during
the course of a study that might affect the usability of
the findings, study objectives and design should be
reconsidered to ensure that the study will remain
relevant. Efforts should be made to deliver reports on
time, especially when study results are intended for
decisions that are made at specified times.



Recommendation D-13. The Department of Education should
ensure that dissemination of evaluation results achieves
adequate coverage. (Chapter 4)

All RFPs and grant announcements should include
requirements for a dissemination plan oriented toward
utilization, and proposal evaluation should give
appropriate weight to the quality of the proposed
dissemination plan. Dissemination plans should include
specification of audiences and their information needs,
strategies for reaching the audiences, provision for an
adequate number of report copies and other materials, and
mechanisms for adapting the dissemination plan as the
study proceeds. Budget negotiations should recognize
that adequate dissemination is costly and cannot be an
afterthought. 



Recommendation D-14. The Department of Education should
observe the rights of any parties at interest and the
public in general to information generated about public
programs. (Chapter 4)

Findings from evaluations must be made available to
those who are importantly affected by the programs being
evaluated, including those who manage them, those who
provide program services, and those who are intended to
benefit (or their representatives). Since evaluations
are paid for with public funds, they should also be made
available to the public at large. The Committee is aware
of the dangers in providing too much autonomy to
evaluation units and contractors, but public interest
needs suggest that, at the dissemination stage,
evaluators should be guaranteed a certain degree of
autonomy. Appropriate changes should be made in contract
provisions to allow contractors and grantees the
necessary flexibility with regard to distribution of
reports and other dissemination strategies.



Recommendation D-15. The Department of Education should
give attention to the identification of "right-to-know"
user audiences and develop strategies to meet their
information needs. (Chapter 4)

Perhaps the most neglected audience for evaluation 
studies consists of program beneficiaries and their 
representatives. We believe that this neglect is not so 
much intentional as it is produced by the very real 
difficulties of defining this set of audiences in a 
reasonable way. In order to more closely approximate the 
ideal that all those having a recognized interest in a 
program should have reasonable access to evaluation 
results, the Department should consider dissemination of 
evaluation reports freely to groups and organizations
that claim to represent major classes of beneficiaries of
education programs. Positive, active dissemination to
such right-to-know groups may include such specific
activities as ascertaining their information needs prior 
to evaluation design and during the evaluation, preparing 
standard lists of groups and organizations to whom 
evaluation results are routinely disseminated, and 
seeking out comments and critiques of evaluation 
reports. Since it is to be expected that such
right-to-know groups will be different for different 
evaluations, careful consideration of the appropriate 
right-to-know groups should be part of the dissemination 
plans that contractors are asked to prepare as part of 
their response to RFPs and grant announcements. 



On General Management

Recommendation D-16. The Department of Education should
clearly spell out minimum requirements for monitoring and
compliance reporting and set standards for meeting the
requirements. (Chapter 5)

Such data items as distribution of funds, number and 
types of beneficiaries being served, and specific program 
services should be defined by the Department so that 
local and state agencies will know exactly what reporting 
is required of them. Quality control procedures should 
be enforced so that adequate performance reports can be
made to Congress. Before setting the requirements,
however, the Department needs to examine its own capacity 
to deal with local and state reports in order to avoid 
collecting information that is never used because of the 
sheer inability of federal staff to deal with the volume 
of reports. The objective of this recommendation is to 
improve the quality of data needed for accountability 
without increasing the burden of response on local and 
state agencies. To accomplish both ends, admittedly 
somewhat difficult to reconcile, the Department should 
consider appropriate development research on what kinds 
of procedures would minimize response burden and at the 
same time ensure sufficient data quality. 



Recommendation D-17. The Department of Education should
examine staff deployment and should establish training
opportunities for federal staff responsible for
evaluation activities or for implementation of evaluation
findings. (Chapter 5)

The Department should consider alternative ways of
using the technical staff within the central unit and the
evaluation staff in other units. The greater the degree
of government involvement in an activity, the greater the
skills and the greater the number of personnel required:
grants and consultancies entail the least involvement,
contracts and evaluation teams configured of government
staff and outside experts more, and in-house studies the
most. The Department should examine the number and types
of positions assigned in light of responsibilities and
workload. It should also examine the academic and
experience background of personnel charged with
evaluation responsibilities. Such personnel should be
well grounded in the theory and methodology of relevant
social science disciplines; they should be aware of the
perspectives of the various parties at interest; and they
should have practical program knowledge. Suitable
training programs should be made available to prepare
staff members adequately for their tasks.



Recommendation D-18. The Department of Education should
take steps to simplify procedures for procuring
evaluation studies, carrying them out, and disseminating
their findings. (Chapter 5)

The Committee is aware that our recommendations for 
opening up the system and for involving minority groups 
and other parties at interest during various phases will
complicate and prolong the evaluation process. However,
we firmly believe that this can be more than compensated
for by simplifying and improving internal management
procedures now used by the Department.

The procurement process has become not only 
restrictive and inflexible but very costly in internal 
staff time and to proposers, though the cost to proposers
is recouped eventually through overhead and in other
ways, so that the government bears the double burden.
Other sources of delay, once a contract or grant for a
study has been awarded, must also be identified and
addressed. This applies particularly to clearance
procedures and to monitor and agency handling of requests
for changes in study design, sampling procedures,
testing, analysis, time frame, and the like. The
Department should consider sanctions and incentives to
encourage timely performance, and it should hold itself
responsible for timely dissemination. 

Our call for timely performance on studies that are 
intended to feed into a specific legislative or 
management decision in no way invalidates the need for a 
more deliberative approach in certain cases. There are
times, especially when an effort is being made to remedy
a problem that is little understood, when it is more
important to promote a variety of studies that explore
emerging leads than to mount a formal study designed to
provide a definitive answer by a specified date. Even in
such cases, however, the pace should be set by the 
research process and concerns for its quality rather than 
by overly cumbersome management procedures. 
Introduction



BACKGROUND

In the broadest sense, evaluation has always been done.
In its more narrow modern usage, "evaluation" has come to
mean the use of recently developed research tools and
concepts of the social sciences to develop evaluation
knowledge. What has social-science-based evaluation 
contributed to education? Two examples, one of national 
scope, the other local, illustrate how such evaluations 
illuminate and sometimes contradict judgments derived in 
other ways; they thus increase knowledge about what 
affects the educational process and how it in turn may 
affect educational and social goals. 

In 1959 James B. Conant published his widely read 
report on the American high school, recommending, among
other things, the consolidation of small high schools 
into large comprehensive schools and an increased 
emphasis on English composition, mathematics, and 
science. His report, based on visits to several dozen
high schools, was essentially the application of his
judgment as an experienced educator to what he saw as 
typical practice in better schools in comparison with 
less adequate schools. He concluded that, in the better 
schools, students were learning more because the 
curriculum offered to them was better, there was a wider 
variety of courses, teachers were better, facilities were 
better, the counseling was better, and so on through a 
list of characteristics generally associated with 
comprehensive high schools. Hence, Conant concluded that 
such schools contributed to the learning achieved by high
school students. Whatever influence Conant's report had
on American education, it was certainly widely read and
discussed at the time. Undoubtedly, the report hastened
the process of school district consolidation that was
already under way and helped the emphasis on academic
achievement that was also aided by the Sputnik
accomplishments of the Russians during the same era.

In a broad sense of the word, Conant's volume
constituted an evaluation of our school system; however,
it was not an evaluation in the sense used in this report
because the means by which Conant came to his
recommendations were not based on the concepts and tools
of social science. He generalized what he found to all
schools even though there was no evidence that the
schools he studied fairly represented all American high
schools. Nor did he collect information on the schools
and students in a sufficiently structured way to allow
replication by other observers. In short, Conant and his
colleagues did not follow the procedures of ethnography,
sample surveys, or experiments; the procedures used
were essentially those of high-level journalism. But,
most important of all, Conant's observations were not
social science because he did not consider alternative
explanations for differences in quality among the more
than 100 schools that he and his collaborators visited.
Were his "better schools" better because of their
curricula, staff, and amount of per-capita student 
support, or were they better for some other reason? 

In contrast, the later work by James S. Coleman and 
his associates (1966) is clearly an evaluation in the 
social science sense. His sample of 469 high schools and 
959 feeder elementary and junior high schools was chosen 
by probability methods to represent fairly the (then) 
21,000 high schools in the United States. Achievement
tests were used to measure the learning of large samples
of thousands of students selected from various grade 
levels within the sample schools. In addition, 
principals and teachers were queried about their own 
professional preparation and about the relevant 
facilities available within each school, such as library 
size, physical education facilities, and age and size of 
buildings. 

While there were clearly some high schools that 
appeared to be fostering higher levels of academic 
achievement among their students, Coleman also considered 
alternative explanations for school differences, among 
which the most important were family background and 
community differences among students. His analysis 
showed that characteristics of schools, teachers, or 
principals counted very little in comparison with family 
background. Indeed, the major difference between schools 
was accounted for by the differences in the mixes of 
students from various backgrounds, with school facilities 
and financial expenditures also counting for very 
little. This finding profoundly shocked the field of 
education. The main policy implication of the finding 
was that changing the academic achievement of children 
through changing the schools was not going to be an easy 
job entailing merely changes in curricula, upgrading of 
teachers, or providing more financial support to the 
schools. 

The importance of testing alternative explanations is 
shown as dramatically in a recent study (Robertson 1980) 
of the effect of dropping driver education from the 
curricula of some Connecticut high schools. In 1976, the 
Connecticut state legislature decided to discontinue 
subsidizing driver education in the state's high 
schools. In response, some of the high schools dropped 
driver education entirely from the curriculum while some 
retained it, financing the classes from local funds. 
Robertson tested the impact of this change on automobile 
accidents involving young persons aged 16 and 17 by 
comparing the number of accidents in counties in which 
driver education was retained with counties in which it 
had been dropped. He noted that over a 2-year period 
the number of accidents involving persons aged 16 and 17 
declined drastically in the communities that had dropped 
the course. 

It would have been easy to conclude that driver 
education was not efficacious in training careful 
drivers, or even that it produced more reckless drivers, 
but Robertson tested a number of reasonable alternative 
explanations. The most plausible of these alternatives 
was indicated by a drop in the number of drivers aged 16 
and 17 in those communities that dropped driver 
education. In short, in communities in which driver 
education was part of the curriculum, young people 
received their driver's licenses at an earlier age and, 
hence, there were simply more people aged 16 and 17 who 
drove. If driver education courses do not lead to a 
reduction in the number of accidents for 16- and 
17-year-olds, it is not because they are not 
educationally effective (we cannot draw conclusions about 
this one way or another from Connecticut's natural 
experiment), but because they encourage more people of 
that age to get licenses. 
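
The reasoning behind this alternative explanation can be made 
concrete with a small calculation. The sketch below is only an 
illustration: the figures are hypothetical (they are not 
Robertson's data), and the names `Community` and 
`accidents_per_1000_drivers` are assumptions introduced here. It 
shows how a raw accident count can fall sharply while the accident 
rate per licensed driver stays the same, simply because fewer 
16- and 17-year-olds hold licenses. 

```python
from dataclasses import dataclass


@dataclass
class Community:
    """Hypothetical figures for 16- and 17-year-olds in one community."""
    name: str
    licensed_drivers: int  # assumed number of licensed drivers aged 16-17
    accidents: int         # assumed number of accidents involving them

    def accidents_per_1000_drivers(self) -> float:
        # Exposure-adjusted rate: accidents relative to the number of drivers.
        return 1000 * self.accidents / self.licensed_drivers


# Illustrative values only; not taken from the Connecticut study.
retained = Community("retained driver education", licensed_drivers=2000, accidents=200)
dropped = Community("dropped driver education", licensed_drivers=1200, accidents=120)

for c in (retained, dropped):
    print(f"{c.name}: {c.accidents} accidents, "
          f"{c.accidents_per_1000_drivers():.0f} per 1,000 licensed drivers")
```

With these assumed numbers the count of accidents is lower where 
the course was dropped, yet the rate per licensed driver is 
identical, which is the pattern the alternative explanation 
predicts. 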
Since Coleman's landmark work, demand for evaluation 

have seen a burgeoning of public programs funded and 
managed through the federal government. The intent of 
such programs has been to alleviate a wide variety of 
societal problems: from unemployment to low reading 
scores of some children in public schools, from 
substandard housing to recidivism of felons, from drug 
addiction to the inadequacies and inequities of the 
health care system. But as a number of the programs 
failed to live up to the expectations that accompanied 
their creation, even as their costs escalated, questions 
were raised as to the reasons for the disappointing 
performance. In response, federal agencies have 
sponsored and conducted a diversity of evaluation 
activities, obligating nearly a quarter of a billion 
dollars for that purpose in fiscal 1977 and investing 
more than 2,000 staff years on the part of permanent 
federal evaluation staff (Office of Management and Budget 
1977). 

Nowhere has the growth of programs accompanied by the 
growth of evaluation been more pronounced than in the 
field of education. The federal part of public school 
income grew from 4.3 percent in 1962 to 8.5 percent in 
1974, from $1.6 billion to $6.6 billion (in constant 
1977-78 dollars). The most rapid increase came in the 
mid-1960s; by 1966 the federal contribution stood at 7.9 
percent, close to the current level (Dearman and Plisko 
1979). The increase was largely the result of the 
landmark Elementary and Secondary Education Act (ESEA) of 
1965 (reauthorized and added to several times since, most 
recently in 1978), which mandated a number of federally 
funded programs to improve the school performance of 
disadvantaged children. Title I, which supports 
compensatory education for poor children, was, and 
continues to be, the keystone program of this 
legislation. To date, more than $26 billion in federal 
funds has gone to state agencies and local school systems 
under Title I (Kirst and Jung 1980). 

Evaluation activities lagged a few years behind, 
though the first legislative requirement for evaluation 
was built into the original Title I legislation. By the 
time the program was 7 years old, more than $50 million 
had been spent to evaluate it (McLaughlin 1975). Current 
federal investment in evaluation of education programs 
totals some $40 million a year (see Appendix A), not 
including federal funds spent for evaluation at the state 


been to establish whether programs are in conformance 
with legislative provisions, whether programs are managed 
effectively, and whether programs are achieving the 
desired goals. It was assumed that evaluation would 
answer those questions and, moreover, provide information 
that could be used to remedy identified deficiencies. 

But achieving evaluations that yield answers has been 
as elusive as achieving successful programs. Early 
evaluations faced technical problems and failed to 
anticipate the highly politicized context that surrounded 
the programs being evaluated. As evaluators learned to 
cope with some of the early problems, more evaluations 
were funded, and in 1970 the Office of Education (OE) 
established a central evaluation unit (see Appendix A) 
and placed at its head an evaluator of some stature. But 
criticism has not abated. Those who sponsor evaluations 
or are in a position to use them continue to voice their 
disappointment, often finding results irrelevant or not 
delivered in time for making decisions on programs. 

Because of the theoretical and technical problems and 
because of questions on its contribution to formulating 
social policy, the field of evaluation has been marked by 
a considerable amount of self-inspection. A large number 
of studies and books have been devoted to analyzing 
evaluation, gauging its effectiveness with respect to 
making policy decisions, developing improved methodology, 
and appraising the quality of individual studies. For 
example, a recent review of program evaluations (Boruch 
and Cordray 1980) cites more than 150 references devoted 
to critiques and analyses of individual studies or of the 
field in general; another recent comprehensive overview 
(Cronbach et al. 1980) cites nearly 200 such references. 
And both these works concentrate largely on the field of 
evaluation in education. 

Many of the published articles and books include 
recommendations for improving evaluations and making them 
more effective. Yet as the field has grown and consumed 
a more visible share of resources, the number of 
questions on the quality and utility of evaluations has 
increased. The latest expression of dissatisfaction came 
from the Congress in 1978 with the reauthorization of 
ESEA (P.L. 95-561). It was a congressional demand for 
improvement in the methods, integrity, and uses of 
evaluations, which led the Office of Education to 
commission the present review of its evaluation 
activities by the Committee on Program Evaluation in 
Education. 
audiences for its report. We identified our major 
audiences as the members of Congress and their staff and 
senior executives within the new Department of Education 
for two reasons. First, these two groups had made 
specific complaints about the effectiveness of program 
evaluation and had asked for recommendations on 
improvement. Second, most of the literature assessing 
the field of evaluation is addressed to its 
practitioners, rather than to the sponsors and potential 
users of evaluations. In the Committee's view, the 
critical self-inspection that has characterized the 
evaluation field has been a mainspring of the development 
of this rather young branch of applied social science. 
While such criticism must continue to provide correctives 
to deficient theory and practice (and to be effective, 
must speak to its own specialist audiences), it will 
continue to miss the mark for those outside the circle of 
"experts": the very individuals and groups who make 
decisions about social programs and who are in a position 
to commission and use evaluations. This report is 
primarily addressed to them, and our recommendations are 
for the legislators and the agency executives who seek to 
obtain greater effectiveness and use from investment in 
program evaluation in education. 

In addition to our main audiences, we believe the 
report will also be of interest to several other 
audiences. One such audience includes state and local 
education authorities, who carry out evaluation 
activities with federal education funds. In some 
instances, our recommendations concern them directly; but 
even when this is not the case, they have a stake in how 
evaluations are commissioned and carried out at the 
federal level because the programs being evaluated are 
the responsibility of state and local agencies. Groups 
concerned with assuring that federal education programs 
meet the goals intended by the legislation are another 
audience. An improved evaluation system will provide 
information to carry out their oversight function more 
effectively. In particular, such information is critical 
to groups interested in furthering equal educational 
opportunity, the goal of most federal education programs 
and mandates. Lastly, though we have made no effort to 
address problems from their particular perspective, 
researchers involved in carrying out evaluations are an 
audience for our recommendations since we intend those 
recommendations to have an impact on how evaluation is 
done and used. 



SCOPE OF THE REPORT 

Among researchers, the term "program evaluation" 
traditionally has been applied to the assessment of the 
impact of a given program. Generally, this has included 
answering two kinds of questions: To what degree have 
the changes intended by the program been achieved? To 
what extent can the observed changes be attributed to the 
program? Early in the Committee's proceedings, however, 
it became clear that this definition was too limited for 
our task and for the audiences of this report. In the 
pragmatic environment in which questions are framed about 
federal education programs, distinctions between outcome 
evaluations (those concerned with the above questions) 
and other types of assessment are frequently 
irrelevant. Congress and Department officials need to 
know how funds are allocated, what kinds of program 
services are being delivered to whom, how management of a 
program could be improved, what program alternatives are 
most effective, and which programs are most 
cost-efficient. In developing new programs or changing 
existing ones, questions must be answered about the 
nature and extent of the need to be met and about the 
effectiveness of proposed programs to meet that need. A 
considerable proportion of the funds allocated to 
evaluation of federal education programs goes to answer 
such questions, and even studies concerned mainly with 
program outcome include activities (and money) devoted to 
those other issues. From discussions with congressional 
and Departmental staff, it was evident that the 
dissatisfaction with evaluation encompasses perceived 
shortcomings in all areas and that focusing only on 
program evaluation as defined by the research community 
would not address the concerns of policy makers. 
Therefore, the Committee has chosen to be inclusive with 
respect to the domain of its inquiry. The terms 
"evaluation activities" and "evaluation," as used in this 
report, cover work undertaken to answer any type of 
assessment or planning question having to do with the 
allocation of benefits, the nature of services, the 
outcomes, or the management of an established or proposed 
program. But we have not given equal attention to each 
type of evaluation activity; we have concentrated on 
those activities for which the methods of applied 
research can make the greatest contribution to policy 
formulation. 

While the Committee has used an inclusive definition 
of evaluation, it has concentrated its attention on a 
limited number of issues, namely those of greatest 
interest to the primary audiences. Congressional concern 
with uniform methods and measures is addressed in Chapter 
2 in the context of delineating different types of 
evaluation procedures and their appropriate use. Issues 
of integrity and independence are treated as part of the 
discussion in Chapter 3 of how the quality of evaluations 
can be improved. Follow-up on evaluations, the third 
issue stated explicitly in the congressional request that 
led to our study, is subsumed under the more general 
topic of the use of evaluation results, which is 
considered in Chapter 4. Finally, Chapter 5 responds to 
the specific request made by Department officials to 
provide recommendations on the organization and 
management of evaluations funded with federal education 
funds. The recommendations and suggestions in Chapter 5 
also take account of implications for management and 
organization that derive from the discussions in the 
preceding chapters of evaluation procedures, evaluation 
quality, and the use of evaluation results. 

The report documents some of the ways in which the 
evaluation system in education currently operates and the 
incentive structure implicit in its operation. The 
Committee makes a number of recommendations that, in our 
view, would improve the current system. We suspect that 
the effective implementation of the recommendations will 
have to take into account the incentives of legislators 
and upper-level managers in the Department as well as 
those of lower-level managers, contractors, and potential 
and actual beneficiaries. Time did not permit a thorough 
examination of how incentives might be restructured; 
instead, we have largely focused on recommendations that 
appear feasible within the present incentive system and 
that we think can produce improvements in the quality and 
usefulness of evaluations. 

Some issues that are the subject of much debate within 
the evaluation community have been given only passing 
attention in the report, such as: the choice between 
quantitative and qualitative methods; the relationships 
between those who sponsor evaluations, those who carry 
them out, and those directly involved with the programs 
being evaluated; and a number of technical matters 
relating to effective collection of data and appropriate 
analytical strategies. Deemphasis of such topics was not 
just a matter of lack of time; it reflects the 
Committee's view that those topics are less important to 
our main audiences and that (particularly in the case of 
technical issues) the Committee would find little new to 
add to the extensive literature in the field. 

Four additional issues pervaded the discussions of the 
Committee, though they had not been identified 
specifically in the 1978 legislative provision calling 
for the assessment of OE's evaluation activities, by 
legislative staff interviewed, or by Department 
officials. Of these, the most important surfaced during 
the very first meeting, namely, how well evaluation 
activities address the broad federal mission of equal 
educational opportunity. To do so effectively requires 
the active participation in the whole evaluation process 
of minorities and other groups intended to benefit from 
federal education programs, from the planning and design 
of evaluations to their ultimate use. The inadequate 
consideration of the needs and viewpoints of the groups 
intended to benefit from programs affects the kinds of 
questions asked about programs, and insufficient 
information about the results of evaluations prevents 
such groups from knowing how to make programs more 
effective. 

The second issue developed as the Committee pursued 
its questions about the current process of commissioning 
and carrying out evaluations in education. As a result 
of external regulations and constraints and internal 
procedures, the process operates so as to limit severely 
the flow of ideas and creativity that must be part of any 
effective research effort, including applied research 
such as program evaluation. The conditions that have led 
to this undesirable state admit of no easy remedy, but 
measures must be taken to open up the process if good 
evaluations are to be carried out. Opening up the 
process is also necessary in order to have greater 
involvement by minority researchers and organizations. 

A third issue also bears on quality and equal 
opportunity, namely, the training of individuals involved 
in evaluation, either as performers or as users. The 
Committee is not advocating an expansion of the field of 
evaluation, but we are concerned that federal mandates 
for evaluation generated both by Congress and by the 
Department (and its predecessor) have forced individuals 
with inadequate preparation into evaluation, particularly 
at the state and local levels. One remedy is to 
reexamine current evaluation requirements and reduce them 
where they are not warranted; a second is to provide 
training and technical assistance as necessary. 
Opportunities for training can also deal with the 
purported shortage of minority researchers and remedy 
specific shortcomings among federal evaluation and 
program staff. 

A fourth issue became evident as the Committee 
reviewed the major themes and recommendations of the 
report. Unless the limitations of evaluation are clearly 
recognized, disappointment will continue. Ideally, 
evaluators are objective and accurate reporters who can 
provide and interpret detailed information about a 
program. In reality, they may be asked to act as judges 
or as support personnel, or they may be perceived as a 
necessary but unwelcome program disturbance. As judges, 
the verdicts of evaluators may be considered uninformed 
by program managers and clients when the evaluators come 
from outside the program and biased when they come from 
inside. As support personnel, their findings and advice 
may conflict with accepted assumptions, policies, and 
procedures. As researchers, the constraints on 
resources, on freedom to design evaluations, and on 
access to information may sharply limit their ability to 
investigate some critical questions. Evaluators often 
must negotiate with various parties at interest (the 
evaluation sponsors; the federal, state, and local 
program managers; teachers and principals; parents and 
students), providing some service of value to each in 
exchange for resources (a program manager's time, a 
sponsor's money) and cooperation. And even when an 
evaluation has proceeded successfully, the results must 
enter a communication stream that contains many other 
messages. Evaluation does not and cannot eliminate the 
need to manage controversy; at best, evaluators and their 
work serve to produce knowledge that can inform decisions 
about programs, decisions that must continue to be made 
through political and managerial processes. 

Though the report is organized into chapters according 
to the topics of greatest concern to our two main 
audiences, the four issues of equal opportunity, opening 
up the process, training, and the role of evaluation are 
woven throughout the text of the chapters. We believe 
that addressing the first two of these issues is 
indispensable to increasing the effectiveness and 
quality, as well as the uses, of evaluations; 
recommendations relevant to these issues are made in 
several chapters. Recommendations on training and 
technical assistance appear in the two chapters dealing 
with the quality of evaluations and the organization and 
management of evaluation activities. As to the fourth 
issue, we hope we have been sufficiently sensitive 
throughout our work to both the importance and the 
limitations of the evaluator's role, even though 
constraints of time and space have precluded the full 
discussion that this issue deserves. 

A question that surfaced several times during the 
Committee's deliberations concerned the appropriate size 
of the federal investment in evaluation relative to the 
federal investment in education programs themselves. 
Depending on what activities are included as evaluation, 
some 0.3 to 0.7 percent of total federal education funds 
are currently spent on evaluation. Several individual 
programs have legislatively established ceilings for 
evaluation activities sponsored at the national level 
(0.5 percent of program funds for ESEA Title I, 1 percent 
for Emergency School Aid Act programs), and there are 
provisions for the funding of state and local evaluations 
within some mandated set-asides for administrative 
expenditures. For large programs, a 0.5 percent 
set-aside for evaluation will yield a sizable pool of 
funds if invested at the national level, but it may be 
inadequate if parceled out at the individual school 
system level; for smaller programs, it may be reasonable 
to spend as much as 10 percent of total program funds 
(see Appendix C). Limited questions about accountability 
can be answered relatively inexpensively, but to try to 
answer complex questions with inadequately funded studies 
may turn out to be a waste of resources. The Committee 
considered current funding provisions and spending 
patterns and makes some recommendations regarding them, 
specifically that evaluation funding be separated from 
administrative costs and that complex and costly 
evaluations not be undertaken without adequate 
resources. But we do not see it as our role to determine 
the proper size of the total pool of funds to be devoted 
to evaluations. The allocation of resources between 
programs and their evaluation depends on the importance 
assigned to the flow of program funds to beneficiaries 
compared to the importance of gaining knowledge about the 
programs and accounting for their effects. This 
determination is largely a matter of political judgment. 
Instead of attempting to determine whether the current 
level of spending on evaluation is too much, too little, 
or just right, the Committee has focused on how those 
funds that are allocated to evaluation can be spent more 
effectively and yield more useful results. 
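
The arithmetic behind the set-aside discussion can be sketched 
as follows. The 0.5 percent and 10 percent rates are the ones 
mentioned in the text; the program totals and the number of 
school systems are hypothetical values chosen only to show the 
orders of magnitude involved, and the function name 
`evaluation_pool` is likewise an assumption. 

```python
def evaluation_pool(program_funds: float, set_aside_rate: float) -> float:
    """Dollars available for evaluation under a percentage set-aside."""
    return program_funds * set_aside_rate


# Rates mentioned in the text.
LARGE_PROGRAM_RATE = 0.005   # 0.5 percent set-aside
SMALL_PROGRAM_RATE = 0.10    # as much as 10 percent for smaller programs

# Hypothetical program sizes (assumptions for illustration only).
large_program_funds = 3_000_000_000   # a large national program
small_program_funds = 2_000_000       # a small program
school_systems = 14_000               # assumed number of participating systems

national_pool = evaluation_pool(large_program_funds, LARGE_PROGRAM_RATE)
per_system = national_pool / school_systems
small_pool = evaluation_pool(small_program_funds, SMALL_PROGRAM_RATE)

print(f"National pool at 0.5 percent: ${national_pool:,.0f}")
print(f"Same pool split per school system: ${per_system:,.0f}")
print(f"Small program at 10 percent: ${small_pool:,.0f}")
```

Under these assumed figures, the same set-aside that funds a 
substantial national study leaves each school system with an 
amount too small to support more than limited accountability 
reporting. 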



Defining Evaluation 



THE ROLE OF EVALUATION 

The literal meaning of the verb "to evaluate" is to 
estimate the value of some object or activity. As 
applied to education programs, evaluation includes the 
set of activities that are aimed at finding out how 
valuable a program may be. Relevant questions include: 
How serious is the condition that the program is designed 
to ameliorate? How is the program supposed to work? 
What would happen without the program? What would happen 
if the program were expanded? How valuable is the 
program compared to other programs? 

Putting things this way makes it very difficult to 
question the value of evaluation. How can one be for not 
knowing the value of a program, its impact on this or 
that, or what would happen if it were altered? How can 
one favor making budgetary decisions in the absence of 
evaluation information of some sort? In short, how can 
one opt for ignorance over knowledge? 

Although the need to know seems indisputable, 
controversy and struggles inevitably arise whenever 
social-science-based evaluations are done and reported. 
First, such evaluations make program goals explicit and 
thus may uncover previously hidden value disagreements. 
Second, they have to compete with other forms of 
evaluation: ad hoc opinions, skillful journalistic 
reporting, intuitive perceptions, and so on. Third, the 
evaluation process is rarely clear cut or simple: a 
given program can be evaluated using a variety of 
alternative research methods, and results are often 
subject to competing interpretations. For these reasons, 
evaluation through social science methods tends to be 
politicized: it cannot help but be influenced by 
political tides, varying ideological perspectives, 
personal goals and inhibitions, technical limitations of 
methods used, economic priorities, etc. 

A special difficulty for evaluation is the fact that 
scarcely anyone likes to be judged, and those who run and 
operate programs or benefit by them are especially likely 
to react defensively to such judging. Even if the 
results of the evaluation may be favorable, the scrutiny 
is difficult to tolerate. There is always the concern 
that one's behaviors, attitudes, and beliefs will be 
misinterpreted and distorted in a professional language 
that is incomprehensible or presented in a form that robs 
one of individual identity. But beyond the personal 
concern that one will be misunderstood or misinterpreted 
is the recognition that evaluations necessarily represent 
some particular point of view and reflect specific value 
positions. By their very nature, evaluations are not 
neutral. Judgments are made based on implicit or 
explicit assumptions about what a program is and what it 
should be. To those running a program or benefiting by 
it, evaluators' judgments are often considered external 
to the program and hence inappropriate. 

It is obviously important that evaluations be 
undertaken by persons who are not deeply committed to or 
involved with the program being evaluated because their 
special interests and deep connections are likely to 
blind them to the program's inadequacies and 
weaknesses. But it is also true that the distance and 
dispassion of an external observer do not necessarily 
lead to objectivity. Distance and dispassion can also 
lead to disengagement from what is going on, a lack of 
identification with and empathy for those who deliver 
program services and those who receive them, or even 
worse, an alienation from and disregard for the 
objectives and values held by them. Good evaluators must 
balance precariously between an intimate and responsible 
knowledge of the program and a distance from it that will 
permit them to see its strengths and weaknesses. 

The evaluation process is further complicated by 
having many diverse audiences that may be eager to know 
about the impact and effects of the programs being 
evaluated. Each audience tends to have its own needs for 
and expectations about information. With various agendas 
and levels of sophistication, such diverse audiences make 
a variety of demands on evaluators, sometimes 
contradictory ones. 
For education programs, Congress and the Department of 
Education constitute two highly visible and crucial 
audiences. They are crucial for two reasons: first, 
they can make the decisions about which program to 
initiate or to expand, which to discontinue or to 
contract; second, they fund evaluations. Although the 
scope and responsibilities of the Congress and the 
Department of Education are clearly the broadest, they 
are not the only audiences to whom evaluators of 
education programs must address their findings. Program 
decisions about education in the United States (even of 
federally supported education programs) are only partly 
made at the federal level: thousands of school boards in 
local communities make most of the school policy that 
affects the specific character of public education. 
State education agencies (SEAs) also affect what is 
taught and how it is taught in each of the 50 states. 
These local and state school authorities may be able to 
use information provided by evaluations if the findings 
are presented in ways that are relevant and 
understandable. Indeed, not enough careful thought and 
attention has been given to the problem of how such 
information can be provided in the most understandable 
and relevant ways. 

Perhaps the greatest impact of evaluations is on those 
who manage education programs and those who provide the 
services of the programs. They are the people whose work 
is being judged. These audiences have the most direct 
involvement in the programs, are most likely to be 
threatened by the evaluation process, and may be very 
fearful that programs will be curtailed or cut off 
because of an evaluation's findings. Program personnel 
are, understandably, usually more concerned with the 
protection of their own programs and projects than they 
are with the advancement of knowledge. Their political 
power can be and has been exercised to save a program 
that appears to be threatened (for example, Head Start, 
Impact Aid). Often, a negative evaluation finding for a 
national program appears unjust to local program 
personnel, who believe that their projects may be better 
than the average, and offers little help to committed 
staff who wish to make improvements. Nevertheless, some 
forms of knowledge from evaluation can be useful to 
program personnel, to teachers and administrators, for 
example, who want practice-oriented information that may 
help them provide more effective instruction. 
The consumers of the services provided by a 
program (parents and their children) also have a stake in 
evaluation, although rarely have national evaluations 
been addressed to this audience. This audience is often 
the most elusive of all because it is not always 
articulate or well organized. When the consumer audience 
has been organized, it has usually been in favor of 
saving a program despite apparently negative findings, 
probably in the belief that it is better to have a 
program, even if its effects cannot be proved, than to 
have no program at all. Yet it is not clear whether 
consumers have more of a stake in the continuation of a 
program, regardless of its success, or in the continual 
improvement of education through development and 
evaluation of program alternatives. We believe that the 
consumers of education programs have been the most 
neglected of all potential audiences, although we 
recognize that to develop this potential audience into an 
actual one will require much experimentation with 
alternative modes of communication. 

To further complicate the picture, there are other 
overlapping constituencies and special interest groups 
that are concerned about evaluation processes and 
findings. These groups often reflect minority 
perspectives that they feel have been neglected or 
ignored by traditional evaluation designs and outcome 
measyres. They argue for the inclusion of their 
perspectives in the goals, methods, analyses, findings, 
and recommendations of evaluations. The National Urban 
League, for example, which has its own sophisticated 
research department, has been interested in the 
evaluations of special programs designed to increase the 
reading scores of inner-city, minority children. It
carefully monitors the programs (value assumptions as
well as instructional methods) as well as the evaluation
strategies, the data, and the language and style in which 
findings are presented. The National Organization for
Women and other feminist groups carry out similar
monitoring of programs and of related evaluations that 
are of concern to them. These special interest groups 
are becoming increasingly visible audiences, and they 
seek to intervene at various points in the evaluation 
process. 

In some sense, an evaluator is expected to provide 
feedback to all of these audiences, an often baffling and 
unrealistic expectation, for each of them has a different 
kind of stake in evaluation, speaks a different language,
and has a different conception of usable knowledge. This
report argues that all of these audiences are important, 
but that any particular evaluation usually should not try 
to be responsive to all of them. Responding to the
myriad and often conflicting expectations of all the 
audiences is likely to diminish the integrity of an 
evaluation and limit its usefulness to any one audience.
The "primary" audience(s) of an evaluation should be
identified by those who call for it and by the evaluators
who carry it out; the design of an evaluation should
anticipate the primary audience(s), and the procedures,
methods, analysis, and the language of its reports should
correspond to the needs and expectations of the primary
audience(s). This does not mean that the findings of an
evaluation will be useless or wholly irrelevant to the 
"secondary" audiences, but it is likely that there will
have to be some amount of translation and 
reinterpretation to make the information useful to them. 
Defining the audience and targeting the message will 
reduce the frustration that often accompanies the more 
eclectic attempts to speak simultaneously with many 
tongues to many groups. Inevitably the selection of the
primary audience(s) becomes a controversial process, one
that must be endured, coped with, and responded to by the
evaluator. In the case of evaluations that are mandated
by Congress or commissioned by the Department, the
mandate should include some designation of the primary
audience(s) to which the evaluation is addressed, as a
guide to the evaluators.

The evaluation process is necessarily a controversial
one that requires more than technical and procedural 
solutions. Technical matters and procedures are not 
unimportant, but there are other important demands that
must be managed with equal care. Those demands include
resolving the tensions among opposing values and
perspectives, dealing with political priorities, and
taking account of contrasting methodological traditions. 
Most of this report focuses on evaluation strategies and 
objectives, issues of quality control, utilization of 
findings, and the organization of evaluation structures. 
Although these technical and substantive questions are 
critical to those seeking to improve evaluation studies
in education and increase their usefulness, it is 
important that the evaluation process be seen in context 
and that the reader be cognizant of the myriad forces 
that combine to shape any evaluation.

Evaluators must respond to these contextual issues:
they must play a role that includes being aware of the
primary and secondary audiences and of competing 
constituencies, finding the appropriate distance from the 
programs that will permit access and understanding but 
not lead to distortion, and seeking to neutralize their 
place in a highly political environment. Although the 
role of an evaluator is in many respects a responsive 
one, it should not be viewed as essentially reactive. 
Evaluators must do more than negotiate among competing 
interest groups or respond to the various priorities and 
needs for information. Unless they maintain some measure 
of autonomy, they will be useless to all those who call 
on their services. It is critical to be aware of the 
needs of the various interest groups, but a keen 
understanding of audience perspectives should not mold
the entire shape of any study. In moving beyond the
reactive mode, evaluators might well be envisioned as the 
translators and bridge builders among the various spheres 
of research, policy, and practice. Because their work 
requires that they be adaptive to several environments,
they have a unique opportunity to find ways of 
translating and interpreting knowledge and understandings 
from one environment to the other. 



THE VARIETIES OF EVALUATION 

A decade ago, social scientists carrying out evaluations 
tended to concentrate on providing estimates of the
relative effectiveness of programs. As experience 
accumulated, however, it became increasingly clear that 
more knowledge was also needed in designing, improving, 
and implementing programs. Hence, the scope of
evaluation has been enlarged to include research in 
support of policy formulation and program development. 
The diversity of research activities being carried out 
under the general term "evaluation" has led to some 
misunderstandings, especially between evaluators and 
policy makers. On occasion, policy makers have used 
"evaluation" to mean research of a particular sort, while 
evaluators have interpreted "evaluation" to mean a
completely different type of research. 

In an effort to improve the terminology employed in
evaluation activities and to make the terms used more 
specific in their meanings, we outline in Figure 1 the 
various uses of social science research in support of the 
design, implementation, and assessment of social 
programs. The remainder of this report draws upon the
terminology established in Figure 1. Both the figure and
the discussion below project a degree of linearity
associated with policy formulation and program management
that is obviously at odds with reality: programs are
more frequently than not enacted before systematic needs
assessment and program testing have taken place; after a
program is implemented, some monitoring questions are
asked too early, others not at all; changes are made in a
program before there is evidence about it, let alone
evidence on the likely effects of the changes. Our
discussion of the different types of evaluation questions
as applied to education programs is sequential in order
to simplify mapping the terrain, not to indicate the
order usually followed — or necessarily appropriate in
every instance.

Questions Arising During the Formation of Policy and the Design of Programs

Policy Question                  Evaluation/Social       Research Methods Used
                                 Research Procedure

A. How big is the problem        Needs assessment        Assembly of archived data
   and where is it located?                                (Census, NCES, etc.)
                                                         Special sample surveys
                                                         Ethnographic studies

B. Can we do anything about      Basic research          Assembly of archived research
   the problem?                                            studies
                                                         Specially commissioned research

C. Will a proposed program work  Small-scale testing     Randomized controlled experiments
   under optimal conditions?                             Pilot studies and demonstrations

D. Can a program be made to      Field evaluation        Ethnographic studies
   work in the field?                                    Randomized experiments
                                                         Field tests and demonstrations

E. Will a proposed program       Policy analysis         Simulation
   be efficient?                                         Prospective cost-effectiveness
                                                           studies
                                                         Prospective cost-benefit analyses

Questions Arising for Enacted and Implemented Programs

Policy Question                  Evaluation/Social       Research Methods Used
                                 Research Procedure

A. Are funds being used          Fiscal accountability   Fiscal records
   properly?                                             Auditing and accounting studies

B. Is the program reaching the   Coverage                Administrative records
   beneficiaries?                accountability          Beneficiary studies
                                                         Sample surveys

C. Is the program implemented    Implementation          Administrative records
   as intended?                  accountability          Special surveys of programs
                                                         Ethnographic studies

D. Is the program effective?     Impact assessment       Randomized experiments
                                                         Statistical modelling
                                                         Time series studies

E. Is the program efficient?     Economic analyses       Cost-effectiveness studies
                                                         Cost-benefit analyses

FIGURE 1 Policy questions and corresponding evaluation procedures.



Evaluations for Planning Programs 

We draw a basic distinction between evaluation questions 
that arise during the planning of programs and those that 
arise after a program is operating. The first half of 
Figure 1 shows the evaluation questions that usually 
arise during the planning of a program, along with the
social science research procedures that are generally
employed to provide answers to those questions. 

Needs Assessment 

Logically, the first question shown in Figure 1 should be 
asked at the outset of discussions about policy. An 
educational problem has been identified, but questions 
may arise about the size of the problem and where it is 
concentrated. Thus, illiteracy may be identified as a 
problem, but there may be little information on how many 
illiterates there are in the nation or whether there are 
a disproportionate number among some age groups, ethnic 
groups, or regions of the country. The social science
research designed to answer such questions has come to be 
called needs assessment. 

The research effort involved in providing answers to
the needs assessment question can be as inexpensive as
copying relevant information from published reports from 
the U.S. census or as expensive as several years' effort 
involving the design, fielding, and analysis of a 
large-scale sample survey, such as the study by Coleman
et al. (1966) on equal educational opportunity. Needs
assessments do not have to be undertaken solely with
quantitative techniques. Ethnographic research may also
be instructive, especially in getting detailed knowledge
of the specific nature of the needs in question; it is
likely to be especially effective in determining the
nature of a need and understanding the processes involved 
in the generation of a problem. Formal quantitative 
procedures, however, are essential when the extent of the 
need has to be established. Obtaining accurate, 
up-to-date data on the size and distribution of a
problem, such as illiteracy, is an important first step 
in planning. Assessment of need and of the contexts in 
which the need is prevalent will help define the 
problem. Needs assessment will also help determine the 
size of a program and attendant costs, at least in part. 
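
The quantitative side of a needs assessment can be illustrated with a minimal sketch that tallies the size and distribution of a problem from survey records. The records, the field names, and the helper `needs_by_group` are hypothetical and invented for this illustration; an actual assessment would rest on a properly drawn and weighted sample.

```python
from collections import defaultdict


def needs_by_group(records, group_key, flag_key):
    """Return {group: (count_with_need, total, rate)} for a list of dict records."""
    counts = defaultdict(lambda: [0, 0])  # group -> [with_need, total]
    for rec in records:
        tally = counts[rec[group_key]]
        tally[1] += 1
        if rec[flag_key]:
            tally[0] += 1
    return {g: (n, t, n / t) for g, (n, t) in counts.items()}


# Hypothetical survey respondents (illustrative values only).
sample = [
    {"region": "North", "illiterate": True},
    {"region": "North", "illiterate": False},
    {"region": "North", "illiterate": False},
    {"region": "North", "illiterate": False},
    {"region": "South", "illiterate": True},
    {"region": "South", "illiterate": True},
    {"region": "South", "illiterate": False},
    {"region": "South", "illiterate": False},
]

summary = needs_by_group(sample, "region", "illiterate")
for region, (with_need, total, rate) in sorted(summary.items()):
    print(f"{region}: {with_need} of {total} ({rate:.0%})")
```

A tabulation of this kind addresses only the extent of a need; the nature of the need would still call for the qualitative work described above.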

Basic Research — Choice of Intervention

The second question concerns whether anything can be done 
about the problem, and if so, what intervention appears 
the most promising. Answers to this question depend 
largely on how much is understood about the problem and 
what policy-related factors can be changed to affect it. 
Basic research is the activity that provides the answers 
to this question. Hence, long-range support for basic 
research on educational processes is critical for the
development of the fundamental ideas for education
programs. For example, it is necessary to know why there
is a connection between socioeconomic level and the rate 
of learning of basic skills by children in order to 
properly design programs to improve the learning rates 
among children from the lower socioeconomic levels. It 
is also necessary to know how much such learning rates 
could be improved by changing teaching methods, by 
lengthening the school day, or by any other policy 
measure that could be translated into a program. Even 
when the ideas for such interventions come from seemingly 
successful exemplary practice rather than from 
fundamental theory, basic research is necessary to
establish the causal connections between the 
interventions and the learning effects in order to 
identify the critical components that make the practice 
successful and, hence, replicable.

At the time that one is looking for proposed
interventions to ameliorate an educational problem,
commissioned review papers may be an easy way to bring
together relevant existing findings from basic research 
since the diverse technical literature dealing with 
educational processes is often difficult to master. 
However, basic research often does not address suitable 
policy variables because basic research is concerned with 
the total causal system as it creates a problem, while 
the variables that can be changed by policy may be only a 
small part of the system. For example, studies of 
children who are disciplinary problems in school may 
stress understanding the links between the family 
situations of the children and their behavior. But for 
policy and programmatic purposes, it would be
considerably more useful to have studies of how
disciplinary systems within schools affect the rates at
which disciplinary problems appear within schools.
General research consciously linked to the role that 
schools and the educational system generally play in 
learning and other behavior may be the best answer to 
policy needs. Such research may take a variety of forms, 
ranging all the way from systematic observational studies 
of school children to carefully controlled randomized 
experiments that systematically vary the policy-relevant 
experiences of children. Without slighting basic 
research support, it should be emphasized that such 
policy-relevant general research needs special grant and 
contract research programs with review personnel that are 
familiar with what is relevant to policy. 



Small-scale Testing — Program Development 

Given a promising intervention, the question that next 
arises is whether a specific program design will work. 
Pilot testing of proposed programs through experiments 
and demonstrations can often lead to better information 
on whether and how such programs might work. Thus, the 
contract-learning experiments funded by the Office of 
Economic Opportunity in the early 1970s showed that, 
while some contractors could provide effective learning 
experiences, the program aroused considerable opposition 
among teachers and school systems and hence would not be 
a successful program if the program mandated the use of 
outside contractors (Gramlich and Koshel 1975). 

We advocate the use of randomized controlled 
experiments at this stage in the development of a program
because they are powerful. But because they are also
expensive, the scale should be relatively modest. The 
great virtue of randomized controlled experiments is that 
they eliminate the possibility that effects may be caused 
by processes other than the intervention; hence, they
give a potentially useful program the most valid test. 
Moreover, program administration can be controlled to 
ensure that the intervention takes place as intended. 
Under such conditions, a program has the maximum chance 
of working; if it is not effective when carried out
under controlled conditions by dedicated researchers,
there is no reason to believe that it will work under any
conditions. However, a commitment to randomized 
experiments for testing programs should not minimize the 
complementary potential of ethnographic studies at this 
stage, particularly to document why a particular 
intervention succeeds or fails. 
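
The logic of a small randomized controlled experiment can be sketched in a few lines: units are assigned to groups by chance alone, and the effect estimate is the difference between the groups' mean outcomes. This is a minimal sketch, not a full analysis; the student identifiers, the scores, and the function names `randomize` and `estimated_effect` are hypothetical, and a real study would also report the uncertainty of the estimate.

```python
import random
from statistics import mean


def randomize(units, seed=0):
    """Randomly split units into treatment and control groups of near-equal size."""
    rng = random.Random(seed)  # fixed seed so the assignment can be reproduced
    shuffled = list(units)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def estimated_effect(treated_outcomes, control_outcomes):
    """Difference in mean outcomes between the treatment and control groups."""
    return mean(treated_outcomes) - mean(control_outcomes)


# Hypothetical pool of students eligible for the pilot program.
students = [f"student_{i}" for i in range(20)]
treatment, control = randomize(students, seed=42)

# Hypothetical post-test scores (illustrative values only).
treated_scores = [52, 55, 49, 58, 51]
control_scores = [48, 50, 47, 51, 49]

effect = estimated_effect(treated_scores, control_scores)
print(f"Estimated effect: {effect} points")
```

Because assignment depends only on the shuffle, differences between the groups other than the intervention arise by chance rather than by selection, which is the property the text relies on.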



Field Evaluation — Program Delivery 

Even if small-scale testing demonstrates a program's 
effectiveness, it should often be changed before being 
widely adopted. The relevant question is how properly to 
adapt a proposed program so that it will be effective 
when it is no longer under the control of researchers or 
specially trained personnel. Unless the program can be
made to work in school systems and in the hands of their 
personnel (or other intended service deliverers), it will 
not alleviate the problem it is supposed to address, no 
matter how effective it was in the experimental setting 
(Rossi 1979a). A process of mutual adaptation often 
takes place (Berman and McLaughlin 1975-78) that changes 
the program as carried out in a given site as much as the 
site is changed by the program. Changes that are likely 
to be made by the people and institutions that will be 
responsible for program delivery must be understood and 
built into the program in such a way that effectiveness 
is maintained or even enhanced. Field evaluation
(sometimes called formative evaluation) uncovers the ways 
in which programs can be changed so that they will work 
well within existing educational settings. 
Unfortunately, such field testing has not been undertaken 
in a systematic way for many education programs, although 
it has been done in other social service fields: the 
national supported-work demonstration (see Manpower 
Demonstration Research Corporation 1979, Maynard et al. 
1979) tested a program of transitional, subsidized work
experience for people with long-standing employment
problems; the youth entitlement demonstration (see Diaz
et al. 1980) tested the notion of linking a job guarantee
to school attendance and performance. 

Randomized controlled experiments are again an 
extremely powerful tool at this stage; optimally, they 
should be used to compare several alternative modes of 
delivery. They should be accompanied by process research 
activities that use sensitive and observant researchers 
in close contact with field testing sites. Ethnographic 
accounts can be extremely useful in understanding why 
programs do or do not work as anticipated, how the 
specifics vary from site to site, and what processes 
impede or facilitate implementation. 



Policy Analysis — Program Efficiency 

Finally there is the issue of whether a program will be 
efficient, a question that is answered through 
prospective policy analysis. Here the issue is how much 
the program will cost, how much service will be delivered 
at what level of cost, and whether the anticipated costs 
of the proposed program overshadow the anticipated 
benefits. Simulation and prospective analysis, using 
data from small-scale tests and from field evaluations, 
are inexpensive and ought to be performed before a 
program is enacted into law or widely adopted. 
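
A prospective analysis of this kind can be sketched as a simple comparison of anticipated costs and benefits under several assumed levels of effectiveness. All figures below are hypothetical, including the dollar value placed on a month of achievement gain, which is itself a contestable assumption; the function name `prospective_analysis` is likewise invented for illustration.

```python
def prospective_analysis(n_students, cost_per_student, gain_months, value_per_month):
    """Return (total cost, total benefit, net benefit) for one assumed scenario."""
    total_cost = n_students * cost_per_student
    total_benefit = n_students * gain_months * value_per_month
    return total_cost, total_benefit, total_benefit - total_cost


# Hypothetical scenarios: assumed achievement gain per student, in months.
scenarios = {"pessimistic": 2, "expected": 5, "optimistic": 8}

results = {
    label: prospective_analysis(
        n_students=1000,       # assumed program size
        cost_per_student=300,  # assumed cost in dollars
        gain_months=gain,
        value_per_month=80,    # assumed dollar value of one month of gain
    )
    for label, gain in scenarios.items()
}

for label, (cost, benefit, net) in results.items():
    print(f"{label}: cost={cost} benefit={benefit} net={net}")
```

Running several scenarios rather than one makes visible how far the conclusion depends on the effectiveness estimates carried over from small-scale tests and field evaluations.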



Evaluations of Existing Programs 

The second half of Figure 1 shows the evaluation 
questions that arise after a program has been enacted and 
is in operation. 



Fiscal Accountability 

Studies of fiscal accountability are perhaps best 
understood by all since they are part and parcel of the 
long tradition of auditing the books of public agencies. 
Procedures are well established and hence much less 
problematical than those for other types of evaluation 
activities. In federal education programs, often the 
only fiscal information comes from grantees' reports on 
the use of federal dollars; usually only the large
programs are audited by federal auditors. Fiscal audits
tend to overlap with other forms of evaluation when
questions are also asked about how the money was used
(not just whether it is accounted for). Since 
conventional accounting categories are generally not 
sufficiently sensitive to determine the level of services 
being delivered, the fact that funds appear to be
appropriately spent in an accounting sense does not 
necessarily mean that program provisions are being 
carried out as intended. Fiscal accounts cannot 
establish program integrity, nor can such accounting
establish the true cost of programs, since it does not
consider hidden or opportunity costs. 



Coverage Accountability 

A significant substantive issue is whether a program is 
reaching the population that is intended to receive its 
benefits. It should be noted that this issue often turns 
out to be of considerable importance: not infrequently,
programs do not reach their intended beneficiaries or
they reach persons who were not intended to be
covered — as was the case for Title VII bilingual
education programs (Danoff 1978) and for the television
program "Sesame Street" (Cook et al. 1975) — or both.
Studies designed to measure coverage are similar in 
principle to those discussed under "Needs Assessment" 
above. An important source of data for this kind of 
evaluation is a program's administrative records, which
often help to identify overcoverage where this is a
problem; undercoverage, however, may often involve
special surveys.
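
The two coverage problems can be made concrete with a minimal sketch that compares a list of intended beneficiaries with a list of persons actually served. The identifiers and the helper `coverage_summary` are hypothetical; in practice the served list would come from administrative records and the eligible list from a special survey.

```python
def coverage_summary(eligible, served):
    """Compare intended beneficiaries with persons actually served."""
    eligible, served = set(eligible), set(served)
    if not eligible:
        raise ValueError("the eligible population must not be empty")
    reached = eligible & served
    return {
        "coverage_rate": len(reached) / len(eligible),
        "undercovered": sorted(eligible - served),  # eligible but not served
        "overcovered": sorted(served - eligible),   # served but not eligible
    }


# Hypothetical identifiers (illustrative values only).
eligible_children = {"c01", "c02", "c03", "c04", "c05"}
served_children = {"c01", "c02", "c03", "c09"}

report = coverage_summary(eligible_children, served_children)
print(report)
```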



Implementation Accountability 

Questions about how a program is being implemented entail
studying whether and how intended educational services 
are being provided. There are many ways in which a 
program can be less effective in the field than 
expected. Local program personnel may not be properly 
instructed in how to administer the program because 
school and teaching staff may not have received needed
in-service training. Regulations may be unnecessarily 
confusing. The local context may militate against 
administering the program as intended, perhaps because
resources presumed to be present may not be. Funds
intended for a particular program may be used to
substitute for funds formerly furnished by other
sources. Programs that require institutions to apply for
grants to extend benefits to the target population may
not be presented in attractive enough terms to achieve
adequate participation rates. As a result, fine-tuning
of basic legislation or of administrative regulations may
be required. 

This kind of evaluation is sometimes also labeled 
process research, because the questions being asked
concern the nature of a program as it is actually being
delivered and experienced at the particular sites and by 
the persons involved there. Such evaluation may be 
relatively simple or may involve measurement problems of 
considerable complexity. Thus it may be very easy to 
learn from schools how many hours per week their new 
computer terminals are being used, but very difficult to
learn what precisely is going on inside a classroom when 
teachers attempt to use a new teaching method, when 
classroom organization is changed, or when other services 
are introduced that are highly dependent on persons for
delivery. Studies that require direct observation and 
measurement of classroom activity may turn out to be very 
expensive to carry out on a large scale. However, for 
purposes of fine-tuning a program, it may not be 
necessary to proceed on a large scale: it may not matter 
whether a particular problem in implementing a program 
occurs frequently or infrequently, since if it occurs at 
all it is not desirable. Hence, small-scale qualitative 
observational studies may be most fruitful. 



Impact Assessment

Is a program effective? To answer this question is a 
task that requires the highest level of social science 
research skills. The essential issue is whether a 
program produces more of an intended effect than would 
have occurred without the program. While the question 
may appear to be simple, impact assessment is extremely 
difficult to carry out well. It entails both the
statement of some measurable goals and the determination 
of what would have happened without the program. Each 
step is difficult. Negative effects must also be looked 
for. Even when measurable goals are agreed to and the 
differences made by the program can be determined,
distinguishing between success and failure is not a
clear-cut decision; there are usually degrees of success
or of failure. A program intended to improve reading
that succeeds in raising students' average reading level
by a half-year more than expected (in the absence of the
program) is less successful than one that has
effectiveness estimates of a full year. This
quantitative difference has to be translated into a
qualitative difference when the decision to fund one
rather than the other program comes into question.

The critical effectiveness issue is whether a program
does anything for its beneficiaries to help them advance
towards the goals of the program. While it is relatively
easy to measure the status of beneficiaries at any time,
the difficult problem is to determine what their status
might have been had they not participated in the
program. An ideal solution to this problem is the
randomized controlled experiment, which ensures that the
people within the experiment who participate in a program
are "identical" to the people in control groups who do
not participate in the program. Randomized controlled
experiments, however, are usually not feasible for
studying programs that have been in operation for some
time, since it is ordinarily not possible to find
appropriate individuals who have not been exposed to the
program to assign to control and experimental groups. As
suggested above, such experiments are most appropriate in
the program development phase. For ongoing programs,
other techniques must be employed, such as comparing
participants before and after a program has been enacted
or comparing beneficiaries to those who do not receive a
program's benefits. Such research and statistical
techniques require extreme care; a large literature that
is devoted to them warns of the many pitfalls in their
use.
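
One way of combining the two comparisons just mentioned is a difference-in-differences estimate, named here plainly as one common technique rather than as the method used by any study cited in this report. It subtracts the comparison group's gain from the participants' gain, and it is valid only under the assumption that both groups would have changed by the same amount in the absence of the program. The reading levels below are hypothetical.

```python
from statistics import mean


def difference_in_differences(part_before, part_after, comp_before, comp_after):
    """Participants' mean gain minus the comparison group's mean gain."""
    participant_gain = mean(part_after) - mean(part_before)
    comparison_gain = mean(comp_after) - mean(comp_before)
    return participant_gain - comparison_gain


# Hypothetical reading levels in grade equivalents (illustrative values only).
participants_before = [2.5, 3.0, 3.5]
participants_after = [4.0, 4.5, 5.0]
comparison_before = [2.5, 3.0, 3.5]
comparison_after = [3.5, 4.0, 4.5]

impact = difference_in_differences(
    participants_before, participants_after, comparison_before, comparison_after
)
print(f"Estimated impact: {impact:.1f} grade equivalents")
```

In this invented case the participants gain a year and a half while the comparison group gains a year, so the estimate corresponds to the half-year effect used as an example earlier in this section.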

Policy makers should call for impact assessment only
when circumstances warrant such studies (see below).
They should be wary of requiring impact assessment from
agencies that cannot marshal the necessary skilled
personnel. They should be equally wary of requiring
impact assessment, which is expensive to do adequately,
without providing sufficient funds. In particular, only
a few local and state education authorities have the
capabilities or resources to competently carry out impact
assessments; hence, such tasks should not be imposed on
all state and local agencies without special attention to
providing sufficient resources.

Economic Efficiency

The final question in the second half of Figure 1 asks 
whether the costs of the program are justified by the
gains achieved. The same question might be raised in a
comparative framework, that is, whether program X is more
efficient than program Y in achieving some particular
goal. While these questions also arise during the
planning phase of program development (see above), at
this point in the process the answers are no longer
anticipated costs and benefits but actual costs and
benefits based on good estimates of effectiveness and
field experiences with the programs. 

The main problem in answering such questions centers 
around establishing a yardstick for such an assessment, 
for example, dollars spent for units of achievement 
gained, for number of students covered, or for classes or
schools in the program. The simplest way of answering
questions of efficiency is to calculate
cost-effectiveness measures, for example, dollars spent
per unit of output. In the case of the "Sesame Street"
program, several cost-effectiveness measures were 
computed, such as dollars spent per child-hour of viewing 
and dollars spent per additional letter of the alphabet 
learned (Ball and Bogatz 1970, Bogatz and Ball 1971).
(Note that the second measure implies knowing the
effectiveness of the program, as established by an impact
assessment.) The most complicated mode of answering the
efficiency question is to conduct a full-fledged
cost-benefit analysis in which all the costs and benefits
are computed. Relatively few full-fledged cost-benefit
analyses have been made of social programs because it is
difficult to measure all the costs and all the benefits
in the same terms. In principle, it is possible to
convert into dollars all the costs and benefits of a
program; in practice, however, it is rarely possible to
do so without some disagreement on the valuation placed,
say, on learning an additional letter of the alphabet.
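
The cost-effectiveness measures described above reduce to a single ratio, dollars spent divided by units of output, as the following minimal sketch shows. The totals are hypothetical and are not taken from the "Sesame Street" evaluations; the function name `cost_effectiveness` is invented for illustration.

```python
def cost_effectiveness(total_cost, units_of_output):
    """Dollars spent per unit of output."""
    if units_of_output <= 0:
        raise ValueError("units_of_output must be positive")
    return total_cost / units_of_output


# Hypothetical totals for one program year (illustrative values only).
program_cost = 8_000_000           # dollars
child_hours_viewed = 40_000_000    # an output that needs no impact estimate
additional_letters = 2_000_000     # an output that presumes an impact assessment

per_child_hour = cost_effectiveness(program_cost, child_hours_viewed)
per_letter = cost_effectiveness(program_cost, additional_letters)

print(f"Dollars per child-hour of viewing: {per_child_hour:.2f}")
print(f"Dollars per additional letter learned: {per_letter:.2f}")
```

The first ratio needs only cost and service records, while the second depends on an effectiveness estimate; neither requires placing a dollar value on the outcome itself, which is what separates these measures from a full cost-benefit analysis.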



WHETHER TO EVALUATE 

Implicit in the preceding discussion is the assumption
that a program, prospective or enacted, can be evaluated 
in some way or another; however, that is not always 
true. There are some programs, whose characteristics are 
described below, that cannot be fully evaluated or that 
cannot be evaluated at all. 

All programs that have been enacted can be evaluated
in the sense of fiscal accountability; procedures that
have been detailed in laws or in regulations can also be
evaluated as to whether they are being carried out as
intended. But only programs that specify clearly the
intended beneficiaries and the intended effects can be
evaluated fully. This is not to say that programs with
vaguely stated aims are not worthwhile; it is to say that
they cannot be evaluated as to their effectiveness.
Thus, a program that has the announced intention of
enriching the cultural lives of high school students
cannot be evaluated with respect to its impact because
the aim of "enriching the cultural life" is simply not
specific enough to provide criteria for judging
effectiveness. In addition, the group of intended
beneficiaries, high school students, is so broad and
inclusive that one simply could not measure "effects" for
all of them.

A prime requisite for being able to evaluate the
impact of a program is the existence of clearly
designated, specific aims. But, as Wholey et al.
(1975:89) note:

    As a natural result of the political process,
    federal programs usually have many poorly defined
    objectives. Authorizing legislation and program
    guidelines are generally vague about program
    objectives and priorities. . . . Policy-makers and
    managers often perceive that ambiguity about what
    constitutes success is an asset, permitting
    flexibility and helping ensure survival.

This situation often puts evaluators in the position of 
setting goals or selecting among several stated goals. A
program may have a number of diverse goals: for example,
Head Start was intended to provide better health care and
nutrition for poor children, improve their cognitive
development, increase their social competence, improve
the conditions of participating families and communities,
serve as a focus for political action and community
organization, and result in more effective functioning of
other service agencies. (See, for example, Office of
Child Development 1973.) In such cases, evaluators and
those who commission evaluations must agree on which of 
the goals are most important to assess and whether they 
are sufficiently specific to permit an impact 
evaluation. Often, however, the problem of goal 
selection is governed by the law of instruments: as the
early evaluations of Head Start demonstrate, those goals
for which measurement instruments exist — for example,
cognitive achievement — will be the goals by which a
program is evaluated, even though other goals may be
equally important.

Some programs allow each local school system to set
its own goals within broad program aims and to design its
own interventions, provided money and services go to the
target population. For such a program, it is possible to
evaluate the impact of individual local projects but
nearly impossible to gauge the effectiveness of the
overall program by aggregating effects over many sites.
A similar problem exists for programs that provide funds
or other assistance to local school authorities without
specifying more than very general goals. These, too,
cannot be evaluated for impact at the national level
because there is, in fact, no national program but a
collection of diverse local programs. For example, Title
I of ESEA is intended to expand and improve education
programs for educationally deprived children but it does
not specify in any detail what is to be accomplished.
Therefore, it cannot be evaluated nationally (except in
the accounting sense), though projects at individual
sites can be evaluated if goals and interventions are
sufficiently specific.^ Indeed, programs like Head 
Start and Title I have never been successfully evaluated 
for national impact, no matter how massive the study,
without heroic assumptions concerning their intended
aims, assumptions that then created considerable
controversy when evaluation findings were released.
Results from individual local studies may cumulate as a
program matures, however, and should be synthesized to 
permit general conclusions. 

This criterion of specificity in aims also applies to
prospective programs. If such programs do not have
specific aims, they cannot be developed properly using
social science evaluation unless sponsors are content to
let evaluators specify program goals and intended
outcomes: experiments and demonstrations cannot be
properly designed without knowing what the criteria for
effectiveness are to be; cost-benefit analyses cannot be 
made without knowing what the anticipated benefits are; 
and so on. 

Techniques have been developed (Wholey 1979, Schmidt
et al. 1979) to determine whether a program can be
evaluated (in the senses discussed above), i.e., whether
it is evaluable. Members of Congress and other decision
makers may want to commission such studies of
evaluability as a first step in evaluation rather than to
assume that all programs can be evaluated. Indeed, we
commend the Department for shifting some of its
evaluation resources in this direction: so far, 10
evaluability studies have been commissioned by the
central evaluation unit of the Department.



WHEN TO EVALUATE 

Even if a program is sufficiently specified to allow both
accountability and impact evaluations, conducting impact
evaluations may be inappropriate at a particular time
because of the stage of program development or
implementation. There are three phases in the life of a
program that are notably inappropriate for impact
evaluations. The first is during the program's
development. We have suggested that a proposed program
be tried out under actual field conditions after it has
been proved to be effective in a controlled experimental
setting. The purpose of this phase is to adapt the
program so that it will be maximally effective under
normal operating conditions. Obviously, impact (or
summative) evaluation is totally inappropriate during 
this phase; at this point, evaluation should be used as a 
tool to fine-tune the program, not to judge it. 

The second phase is after a program has been enacted
and is being put into operation. All programs require a
shakedown period, during which program administrators
develop regulations and operational procedures and
teachers and school personnel (or other service
deliverers) become familiar with the program's objectives
and methods. The more complex a program, the greater the
start-up problems. When a program allows flexibility and
local choice, further time must be permitted for local
decision making and development of specific features.
Until a program has stabilized, it ought not to be
evaluated, except for fiscal accountability. Too many
negative findings have, in the past, been due to
premature impact evaluation. Even accountability
evaluations may be inappropriate in the early
implementation stage, as demonstrated by findings on weak 
administration and even misuse of Title I funds in the 
first studies of the program, findings that did not hold 
up once personnel at the state and local levels had
learned how to operate the program (Kirst and Jung
1980). The Title I studies also demonstrate another
point: If more effective policy analyses were conducted
before implementing a program to ensure that program
legislation and regulations did not lead to confusion in
the field, the shakedown period might be considerably
reduced.

The third phase during which impact evaluations are
inappropriate involves education programs that have
long-range as well as short-range objectives. For
example, career education may be concerned with helping
youth achieve both entry-level skills and satisfactory
career paths. Obviously, the second objective is not
measurable until effects emerge after a number of years.
Assessment of such effects requires time-series studies,
which take long-range commitment, or sophisticated
statistical modeling that requires highly skilled
researchers. Too often, impact evaluations have either
ignored long-range effects as too costly and
time-consuming to assess, or they have attempted
assessment of long-range effects in an unrealistic time
frame. As a result, the full effects of the program
remain unknown, even though evaluation is said to have
taken place. If programs are to be judged by their
results, enough time must be allowed for the program's
full effects to emerge before full-scale impact 
evaluation can be done. 

One final point about the timing of evaluations 
concerns old programs. There is a need to address policy 
issues in programs that have been operating so long as to 
become routinized. How have conditions changed? Are 
there different educational goals? Have the needs of 
intended beneficiaries changed? Periodic evaluations may
provide needed "shake-up" to ensure that a program is 
still meeting priority objectives. 

Recommendation C-1. When Congress requests evaluations
it should identify the kind of question(s) to be
addressed.

At present, there is a multiplicity of requirements 
for evaluation that vary from title to title (see Boruch, 
Cordray, and Pion, Ch. 3 in Boruch and Cordray 1980). In
some cases, Congress calls for elaborate and detailed
evaluation studies involving sophisticated quantitative
techniques and analyses; in others, requests are made for
impressionistic and anecdotal reports. Congress needs to
be more systematic in its approach to evaluation.
Instead of specifying methods, Congress should make sure
that evaluators are clear about the questions to be
answered.

Figure 1 above identifies 10 kinds of evaluation
activities. At least part of the charge that evaluations
have been irrelevant to Congress' needs for information
stems from the fact that Congress has often been
interpreted to be calling for impact evaluation when in
fact it desired only to know, say, how well a program was
meeting its coverage requirements. A call for evaluation
that does not specify what questions are being asked can
lead to the mismatching of expectation and performance by
Congress and the evaluators. While legislators might
include the policy questions to be addressed directly in
the legislative provisions for evaluation of a program,
it may not always be possible to frame questions with
sufficient specificity at the time evaluation provisions
are being enacted, especially for new programs. In such
cases, sufficient dialogue should take place between the
legislators and the implementing agency and the
evaluators to ensure that the evaluation will meet its
intended objective (Berryman and Glennan 1980).

Congressional mandates for evaluation should also
identify the audience that is to be served by the
legislated evaluation: Congress, beneficiaries such as
parent or other interest groups, local program
administrators, federal program administrators, and the
like. The reasons for specifying audiences in any
evaluation are discussed in greater detail in later
chapters. The reason for including audience
specification in this recommendation is that such
specification will also sharpen the policy questions
because different audiences tend to have different
information needs.

Though we recommend that it be specific with respect
to question and audience, legislative language regarding
evaluation should refrain from specifying details of
method (such as sampling procedure or use of control
groups) or of measurement. These are matters requiring
careful technical consideration of specific evaluation
conditions and contexts and should be chosen only after
adequate planning and the application of expert knowledge.



This recommendation is analogous to the one to
Congress, but emphasizes the need to think through what
type of evaluation activity is appropriate at any given
stage of development or implementation of a proposed or
an existing program. While evaluation activities are, of
course, specified in great detail by evaluation personnel
at the procurement stage, this recommendation is directed
to the overall evaluation planning stage when top-level
Department officials need to specify what they wish to
know about a program (i.e., the policy questions), why
they wish to know it at some specified time, and what
other audiences have information needs that must be
satisfied through evaluation activities.



Recommendation D-2. When pilot tests of proposed major
programs are conducted, pilot tests of evaluation
requirements should be conducted simultaneously to
determine their feasibility and appropriateness.

One of the welcome procedural improvements in recent
years has been the greater use of pilot tests of proposed
national programs. The argument is often made that pilot
tests and field evaluations are costly and time consuming
and that an urgent social need cannot remain unaddressed
while the ponderous process of research proceeds. But
the urge to get programs off the ground without prior
testing brings with it certain and often high costs:
programs develop an array of self-interested suppliers
and clients who are likely to fight any changes, even
when subsequent evaluations and research indicate that
they are needed. The Committee endorses the concept of
pilot tests since they have the obvious advantage of
allowing decisions on implementation and on program
changes to be made before programs become entrenched.
Another welcome precedent is that, more and more,
legislation routinely prescribes that programs contain
their own evaluation requirements. Such provisions
ensure that some sort of evaluation will be made of
programs on a continuing basis.

This recommendation focuses on the intersection of
these two developments. While pilot tests of a program
are being made, it is relatively easy to also conduct a
pilot test of the proposed evaluation. Such a pilot test
can be used to find out what measurements can and cannot
be made of program benefits, how programs should account
for and measure costs, which testing instruments and
procedures are disruptive and which are not, how large a
sample of beneficiaries is needed to get valid program
measurements, and so forth. If a pilot test of the
evaluation is carried out in conjunction with the pilot
test of the program, the design of both the program and
of the evaluation requirements will be strengthened.
Indeed, if evaluation requirements are not pilot tested,
it is difficult to see how those charged with evaluation
responsibilities at the local and state levels are to be
held accountable. 

STANDARDIZATION OF METHODS AND MEASURES 

As indicated in the preface to this report, one of the
missions given to the Committee was to make
recommendations and proposals ". . . to ensure that
evaluations are based on uniform methods and
measurements." The Committee's major contribution to
this goal is to attempt to develop a terminology for the
various kinds of evaluation activities, as discussed
above, and to match evaluation questions with appropriate 
research approaches. However, we believe that to proceed 
any further with specific recommendations for attaining 
uniform procedures and measurement is a premature step at 
this stage in the development of evaluation. 

At the present time, the science and art of evaluation 
is in a state of considerable change and improvement. 
Each of the social science disciplines has made
contributions to the procedures now used, and while there
is some agreement on the rough preference ordering of
procedures to address a set of policy questions, the
rapid rate of development along with considerable
diffusion of methods from one field to another means that
today's preferences may be superseded by tomorrow's more
mature understanding of the proper fit between problem
and method. In addition, evaluation activities are being
undertaken in a variety of substantive areas--not only in
education, but in manpower training, energy conservation,
health services delivery, child care, public welfare
payment plans, criminal justice procedures, and so
on--and in each of these areas new methods and procedures
are being developed that can be expected to enrich the 
field of evaluation. 

The Committee believes that, while the goal of
attaining uniformity in evaluation methods and measures
is an extremely desirable one, it cannot be attained at
the present time without prematurely inhibiting further
advances in the field of evaluation and stopping it short
of needed development. The recommendation below that the
National Institute of Education (NIE) continue and
strengthen its program of support for basic research in
evaluation methods is made in part to accelerate full
development of the field of evaluation. 



Recommendation D-3. The National Institute of Education
should continue and strengthen its program of support for
research in evaluation methods and processes.

The field of evaluation is a relatively new one that 
has made considerable progress in the last 15 years; 
however, it is far from fully developed. It continues to 
apply promising research approaches from all the social 
science disciplines and feed back to them the resulting 
experience. Hence, support of research in evaluation 
methodology not only improves the field of evaluation, 
but enriches the basic disciplines — an effect that is 
also important for fundamental research in education. 

The Committee believes, however, that support for 
development in evaluation has been uneven, in particular, 
that too much attention has been given to investigating 
problems in the use of randomized controlled experiments, 
a procedure that has only limited utility in evaluation 
generally. As a result, other important problems in 
methodology have not received sufficient study. 
Especially important is the development of methods for 
studying the delivery of services (implementation), for
investigating the properties of achievement tests when 
used in the evaluation of programs (rather than in 
ranking individuals), and for assessing the impact of 
programs that cannot be studied through the usual
experimental paradigms. 

Another neglected area of research has to do with the 
process of evaluation itself: how studies are 
commissioned and initiated, how they are managed, what 
procedures govern their execution, what legal constraints 
impinge upon them. Evaluation is controlled by at least
three different agencies: the sponsor of the evaluation,
the program or service agency in the field (e.g., a
school system), and the evaluators. When the sponsor is
a federal agency, there are three control points within
the agency: the evaluation monitor, the contracts
office, and the manager of the program being evaluated.
The complexities created by these multiple organizational
relationships create constraints for any study, and those
constraints have been given little attention. Our own
limited findings related to such issues are reported in
the next three chapters; those findings make it clear
that the evaluation process must be better understood if
it is to yield good results.

The National Institute of Education should encourage
work in the noted areas of methodology and process as
part of its evaluation research program. Furthermore,
with rare exceptions, when a specific methodological
question must be addressed in a given time frame or the
process of a specific evaluation is to be studied, all
such research should be carried out through a competitive
grants program that specifies the areas of interest but not
the approach to be taken. 



NOTES 



1 Success here is defined in terms of the objectives of
the program. It is quite possible that a program
successful with respect to its own objectives may be
educationally undesirable. For example, perhaps more
time was spent on a targeted skill and so some other
important skill was neglected and hence less developed
than it would have been in the absence of the
program. To gauge the overall educational 
contribution of a program, it is necessary to assess 
such negative as well as the positive effects. 

2 A good deal of knowledge that can be applied to
program improvement may, in fact, be gained through
documenting program variations and their effects. A
panel of the National Research Council's Committee on
Child Development Research and Public Policy is
currently reviewing outcome measurement in early
childhood demonstration programs. Given that local
program variation is encouraged by many early
childhood programs, the panel has given considerable
attention to the need to consider the relationships 
between variations in treatment and outcomes within 
programs and on adaptations in program practice and 
variations in outcomes from site to site. 



Quality of Evaluation 



Knowledge about the quality of evaluation studies in 
education is limited. It comes from three sources:
technical critiques and reanalyses of specific (usually
large-scale) studies, a few scattered reviews of some
samples of evaluations, and analyses of the influence of
the political context on the quality of evaluations. The
effects of the managerial context on quality--how
evaluations are commissioned and carried out--have
received considerably less attention. Yet the level of
funding, what types of organizations usually perform
evaluation studies, and the availability of adequately
trained individuals all influence the quality of
evaluations. In addition, procurement procedures can
encourage or discourage creativity, and
interorganizational complexities can introduce delays
that often have deleterious effects on the course of a
study. 

There are several dimensions to the issue of quality.
Evaluations can be competently done but not be very
creative. They can be imaginatively done but be sloppy
on some points. The various standards for evaluation
work recently developed by a number of groups (Joint
Committee on Standards for Educational Evaluation 1980,
U.S. General Accounting Office 1978, 1979, 1980b,
Evaluation Research Society 1980) may be useful to the
profession, but since any major evaluation is a
customized task, they cannot resolve quality issues in
any specific instance. Furthermore, quality is
inevitably subjective, especially in an activity such as
evaluation for which facts and values are inextricably
linked. For these reasons, the Committee's
recommendations do not feature rigid requirements.
Instead, the Committee has chosen to highlight some 
defects that commonly stand in the way of improving the 
competence, creativity, and integrity of evaluation and 
to propose ways of institutionalizing some quality 
control mechanisms. In this chapter, we first review the 
available evidence on the quality of evaluations and on 
the influence of the political context and then analyze 
some of the managerial constraints that affect quality. 
In the last section, we focus on evaluation at the state 
and local levels.



REVIEW OF THE EVIDENCE 

Critiques of Individual Studies 

Individual studies of evaluations have generally centered 
on evaluations of highly visible programs with strong 
advocates and adversaries. Some prominent examples in 
education include: the reviews of Equality of 
Educational Opportunity (Coleman et al. 1966), which were 
edited by Mosteller and Moynihan (1972); the critiques of 
the Westinghouse-Ohio study of Head Start (Cicirelli and 
Granger 1969), which were initiated by Campbell and 
Erlebacher (1970) and grew so voluminous that the 
critiques themselves have been analyzed and their impact 
assessed (Valentine and Zigler 1979, Datta 1975, 1976); 
the evaluations and reevaluations of "Sesame Street" (for
example, Ball and Bogatz 1970, Bogatz and Ball 1971, Cook
et al. 1975); the evaluation of the effects of the 
Emergency School Aid Act (ESAA) programs (Crain and York 
1976, National Opinion Research Center 1973), which was 
then the subject of critiques by the National Advisory 
Council on Equality of Educational Opportunity (1975) and 
Acland (1975); and the recent evaluation of bilingual
education (Danoff 1978), which has received much 
political as well as some technical criticism from the 
National Institute of Education and others (U.S. Congress 
1977). Both the technical and the political criticisms 
have helped the evaluation field to mature, although the 
debates have at times been acrimonious and appeared to 
confuse rather than illuminate program achievements and
conditions. The debates may also have created a degree 
of cynicism about evaluation. Whatever confusion and 
disenchantment the critiques and debates have engendered, 
however, they have served to sensitize evaluators to
methodological pitfalls and to the need to consider the
context in which evaluation takes place. More
specifically, as we noted above, they have given rise to
several sets of evaluation standards. Unfortunately, the
total number of studies subjected to open professional
review has been small, and the absence of such review has
not necessarily inhibited the use of evaluation
findings. Datta (1979) analyzes an interesting example
of a study on the effects of federal education programs
(Berman and McLaughlin 1975-78) whose summary findings
were widely accepted and applied in policy formulation
without questioning when later examination revealed
considerable problems with some of the summary
conclusions and the interpretations they had been given.



Reviews of the Field 

Aside from the critiques of some landmark studies, there 
have been few systematic reviews of the quality of 
evaluations, such as assessments of representative 
samples of studies published during a specified time 
period or resulting from the activities of a particular 
sponsor or group of performers. In an early study, 
Bernstein and Freeman (1975) started with 236 studies 
from fiscal 1970, of which they ruled out 84 as not being 
comprehensive, i.e., not measuring both process and 
impact. Using criteria oriented toward quantitative and 
experimental methodology, they found only 27 of the 
remaining 152 studies to be of high quality, less than 20
percent; 76, or 50 percent, were deemed to be of low 
quality. Minnesota Research Systems, Inc. (1976) 
examined 110 research studies (about 45 percent of which 
were classified as evaluations) funded by the U.S. 
Department of Health, Education, and Welfare (HEW) and 
completed in 1973 and 1974. Less than 10 percent were 
deemed to be free of significant methodological flaws. 
Moreover, they found that in 90 percent of the cases the 
flaws already existed at the proposal stage.

The size and the scale of evaluation studies have 
grown considerably since the early 1970s, but problems of 
quality appear to persist. Rossi (1979b) reports on an
examination, done over 3 years for the Summer Evaluation
Research Institute at the University of Massachusetts, in
which several hundred requests for proposals (RFPs) were
screened to look for those likely to lead to a sound
research plan. Using that criterion, less than a dozen
were identified as being suitable for teaching purposes.
On the performer side, evaluation researchers who
screened more than 100 evaluation research reports on
behalf of the Russell Sage Foundation identified only
some half dozen that merited special review as examples 
of high quality (Rossi 1979b). Abt (1979), who heads one 
of the major firms engaged in evaluation research, has 
estimated that only 5 to 20 percent of studies in the 
field of evaluation can be considered valid and relevant 
research. He notes that these numbers might be
acceptable compared with those for basic research but
that they are far lower than is the case for other
applied fields such as engineering or legal research.

Indirect evidence on the quality of evaluation studies
comes from a number of attempts, briefly noted in Boruch, 
Cordray, and Pion (Ch. 5 in Boruch and Cordray 1980), to 
identify exemplary programs. Such attempts — for example, 
finding effective programs to increase equity in 
vocational education, programs in bilingual education, 
and programs in career education — usually yielded only a 
small number for which sufficient evidence was available 
to make judgments as to their educational promise. The 
number of projects so identified tended to be less than 
10 percent. Only in the case of the Joint OE/NIE 
Dissemination Review Panel, which judges exemplary 
projects proposed for dissemination, is the rate of 
projects that show adequate data on effectiveness more 
than 50 percent; as Boruch, Cordray, and Pion note (Ch. 
5:7 in Boruch and Cordray 1980), however, this estimate 
is "biased in the direction of higher quality due to 
voluntary submissions" and the efforts by the panel to
promulgate its standards for acceptable evidence, which
were published by the U.S. Department of Health,
Education, and Welfare (Tallmadge 1977).

Except for Bernstein and Freeman (1975) and Minnesota 
Research Systems, Inc. (1976), these sources of
information on the quality of evaluation studies do not 
distinguish between studies commissioned at the federal 
level and those commissioned or carried out at the state 
or local levels. A number of the studies commissioned and
funded by the Office of Education's central evaluation 
unit have been widely recognized for their technical 
proficiency in terms of general standards prevailing in 
the field. The picture at the state and local levels is 
decidedly more mixed, as documented in two studies cited 
by Boruch, Cordray, and Pion (Ch. 5 in Boruch and Cordray 
1980) that considered the quality of evaluations
performed at those levels. The first study, by the U.S.
General Accounting Office (GAO) (1977), surveyed state
and local officials on how sound and reasonable they
considered evaluation findings from reports produced by
state and local agencies. While reports issuing from the
same level of government were more credible to officials
(i.e., state officials rated state reports more highly,
local officials rated local reports more highly), even
the most favorable ratings considered only two-thirds of
the reports adequate or better, and in the least
favorable cases (state views of local Title I reports),
barely one-third were considered to be adequate or
better. Among other recommendations, the GAO requested 
that the Office of Education review the program 
information collected in local agency evaluation reports 
in order to determine whether such information could be 
aggregated to serve the different needs of federal, 
state, and local governments. 

In the second study, focused on evaluation carried out
at the local level, Lyon et al. (1978) reviewed 116
studies for the presence or absence of criteria
considered to be necessary elements of an evaluation. As
Boruch, Cordray, and Pion note (Ch. 5:7 in Boruch and
Cordray 1980), the Lyon study "suggests that simple
standards are not often adhered to." Holley (Appendix C)
comments that among the possible reasons are insufficient
evaluation funds, insufficient control of the funds and
often of the evaluation activities themselves by program
administrators, and lack of training and experience of
many of the personnel who are assigned evaluation
responsibilities.



The Political Context 

One of the sources of disappointment with evaluation is 
that it appears not to have contributed as effectively as 
hoped to the making of decisions about programs. At 
times, this lack has been attributed to the inadequate 
quality of many evaluations. More recently, however, the 
analytic literature dealing with the contributions and 
failures of evaluation has reflected a considerable shift 
regarding the potential for decision making offered by 
program evaluation. Such early studies as the
Westinghouse-Ohio evaluation of Head Start (Cicirelli and
Granger 1969) were in part condemned for a narrow choice
of outcome measures that did not adequately reflect
program goals. More recent writing has emphasized the
diffuseness, multiplicity, and ambiguity of goals in most
social program legislation. Without specification of 
outcomes that can be measured, program evaluation as 
originally envisaged loses credibility because the 
effects achieved cannot be compared with those intended. 

Researchers do not agree on how to deal with the 
dilemma of program legislation that may be specific on 
process but is vague on intended objectives, yet mandates 
evaluation. Rossi et al. (1979) have suggested that 
program goals should be spelled out specifically enough 
to allow impact assessments; more recently, he and Chen 
(1980) have argued that researchers cannot simply accept 
official goals but must learn how to interpret programs
and their likely effects more accurately in order to
design evaluations that are sensitive to program impact. 
Wholey, when he was Deputy Assistant Secretary for 
Evaluation of HEW, introduced the notion of evaluability 
(see Appendix A) whereby short-term, exploratory 
evaluations would determine the operational objectives of 
a program and whether they could be measured (Wholey 
1979); if they could not, costly impact assessment would 
not be commissioned. Cronbach et al. (1980) argue that
the quest for specification of goals is futile and that 
evaluation is a prospective activity better suited to 
understanding processes and events for future program 
formulation than for retrospectively appraising the 
performance of programs against predetermined objectives.

There is more agreement on the role of the evaluator 
in the decision-making process, namely, that the 
information developed through the processes and by the 
canons of social science is, and should be, only one of
the determinants of policy regarding education (or any
other social) program decisions. Arguments deriving from
research on how evaluation findings are used (Caplan
1977, Alkin et al. 1979) have led to recommendations that
evaluations, to be useful, must be done in close
cooperation with the intended user and must also involve
a process of negotiation that draws on the views of
beneficiary and constituency groups. However, such a
process is often counter to the objectivity considered to
be a hallmark of quality evaluation. According to 
Schreier (1979), it pits the insider's (e.g., client's, 
teacher's, program manager's) intuitive perception 
against the outsider's concern with quantitative 
assessment. The result is that they are unlikely to 
agree on goals. The focus of evaluation may then shift
to good management, the purpose being to improve program
process rather than to ascertain how well outcomes, which
remain unspecified, are being met.

THE MANAGEMENT CONTEXT

Over the last decade, evaluation of education programs 
has become big business, and this has had an impact on
quality. When the first legislative mandate for
evaluation was written into law as part of the 1965 Title
I (ESEA) legislation, evaluation was considered to be an
activity carried out at the local level for
accountability and to improve the program. Every year
thereafter, local evaluation activities were initiated
for a number of programs, usually coordinated by an
evaluation specialist within the federal program office.
As the number of activities grew, concern with quality
and need for generally applicable procedures led to the
establishment in fiscal 1970 of a central evaluation unit
in OE (see Appendix A).



Funding 

Before fiscal 1970, the Office of Education had about
$1.25 million per year available for central evaluation.
In that year, for the first time, there was a separate
line item for evaluation. The peak funding for the
central evaluation unit was reached in 1978, with $29.7
million obligated for evaluation contracts. In 1980, the
amount had decreased to $19.4 million. The most
precipitous drop within the unit came in evaluation funds
for discretionary purposes, i.e., not earmarked for a
specific title: these funds dropped from $7.1 million in
1977 and 1978 to $3 million in 1980 (U.S. Department of
Health, Education, and Welfare 1979b).

According to Reisner's estimate (Appendix A), in
fiscal 1980 the Department of Education was planning to
spend some $40 million on a variety of evaluation
activities, half of the work being carried out by the
central evaluation unit and nearly a quarter by the
Inspector General. If one wishes to calculate the total
amount spent for program evaluation in education, that
estimate needs to be augmented by the amount spent by the
General Accounting Office (estimated at $2.5 million) and
an unknown amount of federal funds devoted to evaluation
activities carried out or commissioned at the state or
local levels. Taking a different approach, Sharp's
analysis (see Appendix B) is based on performer rather
than sponsor data, includes policy studies as well as
evaluation activities, and is for 1977, when there may
have been a somewhat greater investment in evaluation:
her best estimate is that a total of $100 million in
federal funds was spent for evaluation in education at
all levels of government. This amount represents
something like a fourth or a fifth of all evaluation
activities funded by the federal government. By far the
largest growth occurred during the earlier part of the
decade (see Abramson 1978, Cronbach et al. 1980, National
Science Foundation 1979); during the last few years,
federal funding for evaluation, at least that portion
visible at the national level, has actually decreased
somewhat, matching the trend for overall funding for
education. The current investment in evaluation
represents about 0.5 percent of the total federal support
for education, which stood at $14.2 billion in fiscal
1980.



Performers 

Although expenditures for evaluation may appear modest as 
a percentage of expenditures for education, they are a 
major source of income for private-sector performers of
educational research and development. Such performers
account for nearly half of the total spent for evaluation
(Appendix B:Table B-4) and are particularly prominent in
carrying out medium-scale ($100,000-$500,000) and
large-scale (more than $500,000) studies
(Appendix B:Table B-5). Within the private sector,
for-profit firms report that more than 50 percent of
their research activities consist of evaluation and
policy studies (Appendix B:Table B-<3). By contrast, less
than 10 percent of academic institutions carry out
medium- or large-scale studies; some 40 percent report
doing no evaluation work at all (Appendix B:Table B-5).

Moreover, evaluation work is heavily concentrated
among major private-sector performers; they account
for 83 percent of evaluation funds spent in the private
sector (Appendix B:Table B-8). They are also more
heavily dependent on federal funding than any other set
of institutions (Appendix B:Table B-9). As Sharp notes
(Appendix B:219):

     Large private-sector organizations and
     organizations which specialize in education RDD&E
     have especially few other sources of funding: half
     of the organizations which expended more than $1
     million in 1975 for education RDD&E received at
     least 75 percent of their funds from the federal
     government, and one fourth of them received at
     least 90 percent from this source.



Evaluation is a relatively new field that is to a
significant degree staffed with individuals recruited
from other fields. This newness creates a critical
quality problem at the state and local levels (see
below), but important gaps exist throughout the
evaluation enterprise. Of specific concern are the
underrepresentation of minority group members in
educational evaluation, the communication barriers
between evaluators and administrators, and the failure of
individuals charged with evaluation responsibilities to
keep up with developments in the field.

Toward Equal Educational Opportunity

In order to further the national commitment to equal 
educational opportunity, nearly 80 percent of federal 
education programs are targeted for racial, ethnic,
handicapped, and other minority or disadvantaged groups.
And if federal programs are to provide more effective
educational services for these groups, consistent input
on their needs must be part of the evaluation process.
An examination of social science research over the last
40 years (Gregg et al. 1979) shows how research questions
have changed in those fields--and those fields only--in
which the subjects of inquiry have participated actively
in defining the problems. Though talent and skill remain
the prime requisites for evaluation personnel, the
perspective that comes from being a member of the
recipient group augments the evaluation process in
important ways. Thus, one can look at bilingual
education from the viewpoint of society as a whole, of
the classroom teacher, or of the non-English-speaking
child and family. Women, blacks, and other minorities
have helped give a different cast to educational research
that arises out of their perspectives. For this reason,
the Committee is concerned that individuals from these
groups who could contribute to broadening evaluation
perspectives are not adequately represented in the
current staffing and procurement of evaluations.

For example, of the 65 professional staff of the 
central evaluation unit of the Department of Education in 
March 1980, there were 4 black men, 2 black women, 1
Asian man, and 19 white women. There were no Hispanics
or American Indians on the staff. For another example,
in the technical assistance centers (TACs), which have
been created to aid local projects in conforming to the
guidelines and standards set for Title I evaluations and
which presumably should act as models for expanding the
audience and decentralizing the process, not a single
director or senior staff person was a minority individual
as of spring 1980. Of more than 100 evaluation
professionals at any level in the TACs, there were only 8
minority persons. Principals in the central evaluation
unit have consistently expressed a desire to hire more
ethnic and racial minority persons in key professional
positions, but, according to them, have not been
successful in finding those with the appropriate 
background and necessary skills. 

As a group, minority-run firms have fared particularly 
badly in the field of evaluation. Despite special 
provisions for 8-A contracting, only 15 of 200 new
contracts awarded by the central unit during fiscal 1976
through fiscal 1980 went to minority firms, 8 through
the 8-A process and 7 through the competitive process.
These 15 evaluation contracts accounted for less than $4
million of a total of close to $100 million awarded in
those years, or barely 4 percent of the total, and only
10 minority firms were involved.

The issue is not simply nor even primarily an 
affirmative action one. We presume that both the 
Department of Education and its contractors and grantees 
are complying with the laws regarding equal employment
and affirmative action programs. In fact, it has been
argued that women and minorities are already represented
on staffs and in the evaluation enterprise proportionate
to their percentage in the available talent pool. But
this is not the only criterion: they are greatly
underrepresented compared with their numbers in the
beneficiary population. The Committee is not suggesting
proportionate representation, but we are stressing the
importance of this issue in personnel and procurement
practices. In our recommendations below, we suggest some
means for greater involvement of minority firms and
individuals in performing and reviewing evaluations. Our
first recommendation addresses the issue of the talent
pool, since unless it is expanded minority participation
in evaluation will remain limited. At the
same time, the recommendation considers some additional 
gaps in the training of evaluation personnel that must be 
remedied if the quality of evaluations is to improve. 

Training

Recommendation D-4. The Department of Education should
provide funds for training programs in evaluation to
increase the skills of individuals currently charged with
carrying out or using evaluations and to increase the
participation of minorities.

This recommendation covers three training needs that
require extramural support: recruitment and training of
minority individuals; training to improve the
communication between evaluator and the user of
evaluations; and training for those currently involved in
evaluations. Two related issues are covered in other
recommendations: broader technical assistance to state
and local agencies is discussed later in this chapter,
and intramural training for federal evaluation and
program staffs is discussed in Chapter 5.

After 15 years, the rationale that there are no 
minority researchers available to help evaluate education
programs is not tenable. Their absence is particularly
marked, and particularly detrimental, at the senior
levels of both sponsoring and performing organizations.
There are increasing numbers of minority persons in
training in Ph.D. programs in social and behavioral
sciences, in part because of numerous federally sponsored
fellowship programs. These social and behavioral
science graduate students very often express interest in
"applied research," but do not often have an opportunity
to learn about it. They represent a sizable pool of
potential evaluation researchers who could staff
positions in the Department of Education, who could
advise and consult with local and state evaluation
groups, and who could work with universities and private
consulting (including 8-A) firms in carrying out
evaluations. Fellowship and internship programs in
evaluation that include specific priorities for minority
group persons would be doubly valuable. They would
produce good researchers and they would enrich the
evaluation system. Some of the current fellowship
programs could include a special component for people
studying evaluation, and internships could be made
available for people in their third or fourth year of
doctoral study. Such internships might be coordinated
through contractors, states, or local school systems
doing evaluations of federal education programs. A
percentage set-aside from evaluation contracts might be
used as a pool of money for mounting such a national
program. Alternatively, RFPs or grant announcements
might require that such internships be budgeted and the 
training parameters specified. A feeder system through 
other federal fellowship programs concerned with 
increasing minority participation in social science 
research and development activities could also be 
initiated. 

The second training need concerns the relationship 
between the evaluator and the administrator or educator. 
There is often a communications gap between the two that 
renders the use of evaluation far less effective than it
could be. This gap might be narrowed by appropriate
training on both sides. Executives and program staff
could benefit from greater knowledge of the language of
evaluation and how evaluations can be used. Short
training sequences on such topics might be developed and
made routinely available to new staff. For the
evaluator, who often lacks experience in program 
management or delivery, exposure to the problems, 
procedures, and constraints of federal education programs 
would be similarly beneficial. In addition, training 
should be directed to improving both the interpersonal 
and the communication and reporting skills of the 
evaluator so that evaluation information is conveyed as 
usefully as possible. 

A third type of training is needed to assure a 
minimally adequate level of skills for persons newly 
assigned to evaluation responsibilities and to allow 
others to keep up with the field. Despite the entry into 
the field of many individuals without the requisite 
skills and the rapid development of evaluation 
techniques, which makes once-adequate skills obsolete, 
training in evaluation is currently inadequate
or unavailable. The Committee is less interested in the
number of new graduate students recruited to the field
than in improving the skills of current performers and
users. Sufficient numbers of staff trained either in
rigorous evaluation methods or in research have never
been available. As a consequence, evaluation is
currently practiced by people with almost every type of
background possible, including many with no more
preparation than that of classroom teaching. These
practicing evaluators need opportunities to upgrade and
improve their skills. (See Appendix C for details on
training needs among local personnel and on some possible
programs.) Insofar as new evaluators continue to be
recruited, graduate-level training programs for
evaluators will continue to need support. In part, such
training would occur automatically through greater
participation of the academic sector in evaluation work
sponsored by the Department. 

The suggestions in this recommendation require the 
funding of extramural training and fellowship programs. 
One channel for such programs might be the Assistant 
Secretary for Educational Research and Improvement, 
either through the Office of Dissemination and 
Professional Improvement or through the National 
Institute of Education, which already runs a program to 
increase the participation of women and minorities in 
educational research and development (R&D).
Congressional authorization for such programs already 
exists, at least for NIE, in the 1980 Higher Education 
Amendments (P.L. 96-574), and in the Special Projects 
Act, though the latter requires that Congress be notified 
before a program is initiated. 



Interorganizational Complexities

There is an important difference between most social
science research and evaluation. In most research, 
control of a study is mainly in the hands of the 
researchers: they decide what to study and how the 
research is conducted. Even when action sites like 
schools are involved, the researchers select them on the 
basis of the intended research design, and if some sites 
are unwilling to cooperate, others can be substituted. 
The funding agency's role is usually limited to 
negotiating grant amounts and requiring nominal progress 
reports. 

In evaluation, the researchers share control to a
considerable extent with two other parties--the
sponsoring agency and the program or action agency.
First, the sponsor sets conditions by designing the RFP
that solicits the evaluation, including the level of
effort, the scope of work, the types of issues, the
research design and measures that are to be used, and the
timing. Second, the nature of the action program itself
imposes constraints, including how funds are allocated
within the program, how far along it is in the
implementation process, and how much freedom is given to
individual sites to carry out their own miniprograms.
Third, the research team must work with a specific set of
action sites. In order to establish workable
relationships with action sites that may be reluctant
participants, the researcher must provide a set of quid
pro quos, such as collecting data not necessarily
relevant to the evaluation study but wanted by people at
the site, providing technical assistance, or carrying out
special analyses. Moreover, neither the action site nor
the sponsor is a monolithic entity, and different
requirements and constraints may be imposed by different
organizational units within each. Of particular
importance is the increasing fragmentation of
responsibilities within federal executive agencies (the
usual type of sponsor), in which at least three parties
may have some influence over the design and conduct of
research: the project monitor for the evaluation study
itself (and the cognizant evaluation unit), the program
manager and responsible office for the program being
evaluated, and the contracts office. The resulting
context for evaluation is depicted schematically in
Figure 2 (see Yin 1980).

The quality of evaluations is subject to the marked
constraint imposed by the need for researchers to work
within these interorganizational complexities: each
decision has to be negotiated and agreed to by a number
of parties. If nothing else, the process of arriving at
compromises acceptable to all parties is time-consuming,
often to a degree that makes the original study design no
longer feasible; this is especially true during the
procurement phase and the implementation phase.

FIGURE 2 The interorganizational complexities of
evaluation research.

The low participation of the academic sector in
evaluation work should not be surprising, even though
academic organizations represent the largest single group
of performers of all educational research (Appendix
B:Table B-4), because of the process by which evaluations
are procured by the federal government. That process has
become more and more complex over the decade of growth in
evaluation funding. Requests for proposals (RFPs) have
become longer and more detailed: in addition to spelling
out basic design, methodology, what to measure and how to
measure it, they may specify the sites to be studied, the
data elements to be analyzed, and the time intervals for
different collection steps. Responders have little
freedom to formulate research approaches they consider
more appropriate, let alone to reframe evaluation
questions. Moreover, the average response time allowed
hardly permits such luxury: for eight of the ten RFPs
issued for new studies in fiscal 1980 by the Office of
Program Evaluation (the central evaluation unit for the
Department of Education), proposals were due only 1 month
after issuance of the RFP; for the other two RFPs,
proposals were due in 6 weeks (see Table 1). The
proposed length of time for these studies ranges from
18 months to 2-1/2 years and their projected cost ranges
from $150,000 to $2 million. The largest of these
studies, which comprises a whole series of substudies of
the implementation of Title I at the state and local
levels, is estimated to take 2-1/2 years and cost $2
million. The RFP for this study was issued on July 23;
proposals were due 29 days later, on August 22.



TABLE 1 Milestone Dates--Fiscal 1980 RFPs--Office of Program Evaluation

                                             Work         Work                  Proposals
                                             Statement    Statement             Due
                                             First Draft  Final Draft  RFP      (Closing   Contract
                                             to GPMD(a)   to GPMD(a)   Issued   Date)      Award

Development of bilingual evaluation models   1/31         2/13         3/6      4/7        6/27
Assessment of Women's Equity Act Program     2/12         2/22         3/14     4/14       6/30
Description of state management practices
  in ESEA Title I                            3/12         3/24         4/25     5/27       6/30
Assessment of the Strengthening Developing
  Institutions Program                       3/31         6/3          6/25     8/5        9/30
Evaluation of Basic Skills Improvement
  Program                                    3/26         6/5          7/1      8/4        9/30
Management studies of federal education
  programs                                   4/9          6/20         7/10     8/22       9/30
Evaluation of impact of Part A of
  Indian Education Act                       ?/30         6/25         7/18     8/18       9/30
ESAA-funded activities and Management
  Information System                         5/27         6/13         7/3      8/4        9/30
Description of ESEA Title I district
  programs since 1978                        5/30         7/1          7/23     8/22       9/30
Assessment of ESEA Title I Program
  for Handicapped                            5/30         7/7          7/23     8/26       1981(b)

(a) Grant and Procurement Management Division.

(b) Originally planned for 9/30/80, postponed until fiscal 1981.

Tight timetables for preparation of major evaluation
proposals are the rule, though the reasons vary from year
to year. In 1980, the cause was a complicated internal
planning process combined with the need to expend
evaluation dollars during the fiscal year in which they
were appropriated. Evaluation plans submitted by the
Office of Program Evaluation in the spring of 1979 were
not approved until January of 1980; some studies were not
approved until May. Therefore, except for two RFPs that
had been held over from fiscal 1979, no work statement
could be completed until March, and a number were delayed
until June or July by further review within the Grant and
Procurement Management Division, the Department's
contracts office. Thus, seven of ten planned awards for
new studies were not scheduled until September, at the
very close of the fiscal year.

Institutions whose business is based on federal 
contracts resulting from RFPs and who have considerable
staff resources assembled at any point have an obvious
advantage when responses must be made in such a time
frame. The recent change in the federal government's
fiscal year has positioned many complex procurement
actions in the summer quarter, a period during which
academic institutions are even less likely to be able to
respond quickly. Contract records substantiate Sharp's
findings (Appendix B) that universities and small-scale
performers are largely shut out of the types of studies
($100,000 and over) that have been in favor. Of 64
contracts for evaluation and planning awarded by the
central unit in 1979, only 1 went to a university, in the
amount of $350,000 of a total of $21,526,069 in awards.
On the other hand, one for-profit firm received four
contracts for a total of more than $5 million. Nineteen
contracts to three private firms and one large regional
laboratory (also a private corporation) accounted for
50 percent of all funds awarded. Through their success
in responding to evaluation RFPs, the private performer
organizations have been able to accumulate "a
sophisticated, multidisciplinary staff which are very
knowledgeable about the major educational issues of the
day" (Sharp, Appendix B:241). Whether current
procurement procedures with their tight deadlines and
enormous response burdens serve to deploy effectively the
talent pool in even this limited domain is open to
question. The reviews of evaluation proposals cited
earlier in this chapter are not reassuring about the 
quality of responses elicited by the procurement process.

Constraints operate not only during the procurement
process and original design phase, but also during the
execution of any study. The first obstacle after a study
is launched is to obtain clearances for data collection
instruments. Clearance procedures (described in greater
detail in Chapter 5) may take 5 to 6 months. Three or
four different bodies are involved in the process,
looking at the study design, the data collection
instruments, and the analysis plan from a variety of
perspectives: burden on respondents, technical quality,
need to know (defined as being required by law), and
economic impact. Not infrequently, research designs and
instruments that are the product of experts and that have
been pilot tested are changed by reviewers who do not
have equivalent expertise or field experience. If a
study is to be done at all, many compromises have to be
made along the way by the contractor and federal monitor.

In 1978, a new requirement was added to the clearance
process, namely, that all test and data collection
instruments to be used in a study must be described in
the Federal Register (and available on demand) by
February 15 previous to the school year in which the
information is to be collected. This requirement,
when added to all the other clearance machinery, so
compresses the time available for development of
instruments and questionnaires that quality takes a back
seat to doing the study at all. It also severely limits
the possibility of making changes as a result of
conditions in the field or as promising lines of inquiry
develop during the course of a study. Keeping key staff
who are essentially unproductive as they await clearance
to go into the field squanders time and money that could
have gone into improved design, data collection
instruments, and analysis.

Even past the hurdles of clearance, a funding unit
exercises great influence over the nature of evaluation
studies through the monitoring process. When unexpected
conditions arise that may require changes, such changes
will be affected by agency officials because of their
active role in approving or rejecting requested
modifications. Decisions may be slow in coming, since
most of them will require agreement among the three
internal agency parties involved (evaluation monitor,
program manager, and contracting officer). Agency
officials and performers have to understand and resolve
the tension between necessary changes in direction and
timely delivery of an evaluation study; creative skills
are required to negotiate such tensions successfully
without impairing the quality of the study. In some
cases, it may be more important to deliver findings on
time than to ensure that the results are as
methodologically rigorous as possible. The balance
between adequate agency procurement and monitoring
procedures and creativity needed from the field to
produce high-quality evaluations has in recent years
swung heavily toward agency control and, within the
agency, to control by contracts and grants management
specialists rather than by technical evaluation staffs.
The three recommendations below are aimed at introducing
greater creativity and competence into the evaluation
process during three stages: procurement, while a study
proceeds, and after completion.



Recommendation D-5. The Department of Education should
structure the procurement and funding procedures for
evaluations so as to permit more creative evaluation work
by opening up the process and allowing a period for
exploratory research.

The increasing constraints imposed as a result of the
greater visibility of evaluations and the attempts to
control their management and process have limited
contributions from the field of evaluation. These
constraints have reduced the opportunity for infusing
novel approaches into either programs or evaluations.
They have also reduced the potential of evaluation to
contribute to the policy process.

The more complex the evaluation, the less likely is it
that anyone can spell out ahead of time the best methods
for addressing the questions that the evaluation is
designed to answer. The current RFP process in
particular ignores this fact. The Committee believes
that this process can be made more flexible. An RFP
often presumes some things about the program are known
when they are not. This can range from something
fundamental--e.g., existence of the program at a site--to
something trivial--e.g., existence of records. RFPs also
often downplay the possible effects of interorganizational
relationships on the evaluation process. In
addition, problems and issues in executing the evaluation
are not anticipated, and many cannot be anticipated. The
unknowns or unknowables suggest that an RFP that attempts
to be specific is bound to be inappropriate. Therefore,
RFPs should include a period of exploratory research
before the evaluation is undertaken in order to frame
questions properly--with the aid of the consultation
process suggested in Recommendation D-6 below--and to
figure out what the unknowns are. RFPs should also
provide for side studies that are research oriented to
illuminate questions that emerge during the evaluation
and that should be answered if the evaluation is to be
done well.

Precedents for encouraging exploratory research before
an evaluation is undertaken exist: James Coleman had the
benefit of 1 year of planning for his national
longitudinal study of the high school class of 1980 
(Coleman et al. 1979). That planning included intensive 
research on what kinds of policy issues could be 
addressed in the future using such data. As another 
example, the NIE compensatory education study (National 
Institute of Education 1976) had 6 months to clarify 
questions before the study was initiated. 

Mechanisms for providing opportunity for expertise in
evaluation to improve the quality of evaluations include:

• inviting bidders to specify alternative methods
of evaluating the program at hand and how such methods
would be tested, in addition to asking that they meet
formal RFP requirements;

• inviting bidders to design small side studies
that can lead to durable general statements about
particular approaches and providing support for those
side studies found to be meritorious;

• assuring that sufficient time is available for
developing proposals for an evaluation project, at least
6 months for complex evaluations;

• issuing RFPs for pre-evaluation assessments that
define the problem better, lay out alternative approaches
to evaluation and how they might be assayed, and so forth.

Beyond improving the RFP process, there are other
steps the Department should take to introduce greater
creativity. The procurement process now used by the
Department to obtain most evaluation studies virtually
limits all contract applications to organizations that
have the capacity to assign full-time specialists who can
be immediately responsive to RFPs. Under this system,
the evaluation program is effectively cut off from the
academic community, which has made major contributions to
the theory and methodology of evaluations. It has been
argued that academic researchers are disinterested in
applied research such as evaluation, since they are more
highly rewarded for basic research, and that the
disciplinary structure of universities does not lend
itself to policy-relevant research. Though there is some
justification for these views, one cannot conclude that
universities will not and should not participate in
carrying out evaluations. The academic world is no more
monolithic than any other community; within many
universities, there are institutes or centers created
precisely to respond to the interdisciplinary challenges
of applied social science research. In addition, as
funding for basic research has leveled off or even
decreased, academic researchers have become more
interested in applied work. The dismal statistics on
lack of participation by universities in evaluations
funded by the Department cannot be attributed solely to
the unwillingness of universities to participate.

By depending almost entirely on the competitive RFP
procurement system, the Department is not able to take
advantage of the creativity, objectivity, long-term
commitment, and the cumulative knowledge and experience
of the academic community. Nor can it attract
participation by minority researchers, whose perspectives
would enrich the questions and methods of evaluations,
who are not able to assemble the resources needed for
large studies in the time provided. Local and state
agencies also cannot often contribute at the national
level, even when they have the capability for
high-quality work, because of the site requirements in many
RFPs. Among the mechanisms for funding evaluations that
can be used to open up the process and improve quality
are unsolicited proposals, sole-source awards, 8-A
contracting, cooperative agreements, basic ordering
agreements, and grants.

The Department should consider unsolicited proposals
in order to encourage creative and innovative ideas that
may be lost through the RFP system. Academic experts who
have made significant contributions to the evaluation
process should be encouraged to submit proposals that
attempt to break new paths in theory or measurement of
the effectiveness of education and other social
programs. It is possible to carry out a competitive
program of grant awards for unsolicited proposals in
specified areas, as practiced by the National Institute
of Education.

When the Department wants to take advantage of the
expert knowledge of an academic scholar who may have made
a significant contribution to a particular subject area,
it should have the authority to solicit a specific
proposal. Some members of the academic community have
unique knowledge and skills that are not found
elsewhere. The Department should have the authority to
offer a sole-source award to a scholar in the field of
evaluation whose background, experience, and expertise
cannot be matched. The use of this mechanism will help
to open up the system to new ideas and contribute sorely
needed flexibility to the Department's evaluation
activities. The Committee is fully aware of recent
criticisms of consulting and sole-source procurement
(U.S. General Accounting Office 1980a, Gup and Neumann
1980, but see Wilson 1980). We believe, however, that
the limited and judicious use of this mechanism can
produce gains that far outweigh the risk of occasional
abuse. When abuse does arise, it should be dealt with on
a case-by-case basis, not by abandoning a useful
procurement mechanism.

The restrictiveness of the RFP process also
contributes to the very low use of minority firms by the
Department in securing evaluation contracts. Such firms
are usually small and have limited staff and so they
cannot respond as quickly to RFPs as the larger
for-profit organizations that now dominate the evaluation
field. The 8-A contracting process seems to be seldom
used as a way of involving more minority firms, probably 
because evaluation studies have tended to be large scale 
and firms are small. The issue of equal educational 
opportunity that calls for the greater use and 
involvement of minority researchers will only be resolved 
when more flexibility is built into the design of studies 
and the contracting process. 

Cooperative agreements ought to be the mechanism of
choice when the principal purpose of the award is to 
benefit local or state operation of education programs 
authorized by federal statute. Such agreements may also 
be used when substantial involvement is anticipated by 
the federal agency as well as by the recipient of the 
funds. Studies carried out by a state or local agency to 
document program processes, improve program 
implementation, or test program alternatives are intended 
to benefit the locality, but they can also help improve
the program nationally. The former Department of Health,
Education, and Welfare had an internal decree against
cooperative agreements, though they are used much by the
Law Enforcement Assistance Administration of the
Department of Justice. The Department of Education
should exploit the potential of this procurement
mechanism. Cooperative agreements are an obvious vehicle
for encouraging local and state agencies that have the
capacity to undertake evaluation work aimed at program
improvement.

Basic ordering agreements are a particularly useful
mechanism for planning or evaluability studies and other
limited work with a short time horizon. The Department
could obtain greater flexibility and faster turn-around
time by maintaining lists of qualified performers
generated through periodic requests for qualifications
(RFQs). These performers could then be called upon for
limited studies.

Grants are a particularly appropriate mechanism when
creativity from the performer is important. The
Committee urges that the Department institute at least
two grant programs: one for local and state agencies (see
Recommendation C-3 below) and a small grants program
($50,000-100,000 per grant) to allow university
researchers and others to pursue evaluation questions in
designated areas of interest to the Department. The
small-grants program should be run in conjunction with
the research program at NIE suggested in
Recommendation D-3 (in Chapter 2). Research grants are
often considered to be appropriate only when the primary
audience is to be other researchers and hence are
considered inappropriate for policy-related research. But
grant programs do not have to be untargeted, as is
demonstrated by the well-defined grant programs developed
by the various study sections of the National Institutes
of Health and of Mental Health. Not infrequently, the
research is both applied and immediately applicable, as
in the case of the restorative materials program funded
by the National Institute of Dental Research.

The state and local program we are recommending could
be in the form of grant awards or cooperative
agreements. The purpose would be to allow selected
agencies to study their own federally supported programs
by documenting what actually goes on in the program at
the classroom or school level, assessing the effects of
the program or some of its components, and testing
alternative program interventions. There should be
national or regional competitions for each large federal
title and one catch-all category for the small programs.
Panels of outside experts (including nonfederal
researchers) should evaluate proposals. Proposals should
be required to state how results of a study will be
incorporated into pertinent local or state agency
operation. The Department should use existing mechanisms
like state agency dissemination arms, assistance centers
attached to various federal education programs, or the
National Diffusion Network (NDN) to disseminate and apply
findings nationally.

The Committee's recommendation that a greater variety
of procurement methods be employed does not suggest that
the use of RFPs be drastically reduced. We recognize the
need for organizations that can mount nationwide surveys,
carry out complex tasks, and have available large numbers
of experienced analysts. Our call for flexibility in the
procurement process, we believe, will reduce the
sterility of the evaluation system through the
introduction of new ideas and will permit increased
consideration of different perspectives that can
contribute to the educational system.

Review

A common defect in past evaluations has been that only a
small group of people in the agencies and among the
contractors are talking to each other; they are doing
things in standard ways and perhaps missing new
developments in technique or new ways of evaluating or
running programs. The results of evaluations are then
made available and often taken on faith by the
educational community. Since evaluation is a difficult
and ambiguous activity, the evaluation process would, in
the Committee's view, be improved by opening it up, even
if this results in longer time frames.



Recommendation D-6. All major national evaluations
should be reviewed by independent groups at the design,
award, and final report stages. Review groups should
include representatives of minorities and other consumers
as well as technical experts. The results of their
review should be made broadly available.

Insofar as it is feasible, such reviews should also be
conducted for major state and locally sponsored
evaluations.

This recommendation has three facets: improving
the technical quality of evaluations, ensuring early
contribution and involvement from those most affected by
the program (beneficiary groups, teachers, etc.), and
making use of the findings more likely through public
exposure and understanding.

For major national evaluations of important programs, 
the evaluation plan should be publicized by the agency 
before the project begins. When the RFP process is used,
the agency itself should solicit as much outside advice
as possible, through development of concept papers,
planning conferences, and other pre-RFP activities.
Proposal review should include experts from outside the 
sponsoring agency. After award of a contract, the 
contractor also should solicit the views of outsiders. 
Some questionable assumptions or pedestrian analytical 
approaches might be amended at this point. Then, when 
the project is done, outsiders should again review the 
work, its philosophical perspective, its technical 
ambiguities, and its policy implications. Such outside 
review would be facilitated if researchers were careful 
to spell out, in final reports, the limitations of their 
research: "... what went wrong, what couldn't be done, 
what that means for the conclusiveness of the findings 
and . . . for their generalizability to particular 
populations" (Chelimsky 1978). Later on, the data from
the evaluation should be made available to others for
reanalysis. If evaluations are controversial, either
because of their execution or because of their 
recommendations, this process will allow such 
controversies to be aired. All of the results of this 
interchange, the evaluators' reports and the comments of 
outsiders, should then be made broadly available. 

There may be several ways to ensure adequate input and 
broad availability. One approach worth exploring is for 
the Department to sponsor an annual conference on
important evaluations that are at various stages in the
process: design, first completion, reanalysis. If this
were done, the educational community would know where to
look for the latest evaluation results, criticisms, and
reanalyses, as well as for information about impending
work.

In line with previous remarks about the subjective 
nature of evaluation quality, opening up the evaluation 
process should provide mechanisms similar to those 
employed by such journals as Consumer Reports with regard
to the market for consumption goods. The Department
should not be the arbiter of evaluation quality. But it
can make sure that all evaluations are subjected to the 
scrutiny of outsiders so that the educational and 
beneficiary communities at large, as consumers of 
evaluation information, can see the pros and cons, the 
ambiguities and questions, and make up their own minds. 
In the long run, this greater information and exposure is 
the surest way to make certain that evaluations will 
consider the perspectives of parties at interest, will be 
of high quality, and will not be ignored. 

This recommendation implies that evaluations will not 
generally result in an immediate consensus on the value 
of an education program. To a certain extent, this lack 
of consensus is a fact of life in the field of education, 
and the Committee would be remiss if it did not warn 
Congress and the Department of Education of this fact. 
But we see in the suggested mechanism some ways of trying 
to resolve the real controversies. As part of a 
subsequent reanalysis process, conference participants
might try to agree in advance on further analyses to be
done and what they could show. In that way, there might
be a greater chance of arriving at agreement on the
results of the second round of tests and analyses. The
same logic also applies to the idea of presenting
evaluation plans: It is likely that when more voices are 
heard early on, less acrimony will be heard later on. 



Recommendation D-7. All statistical data generated by
major evaluations should be made readily available for
independent analysis after identifying information on
individual respondents has been deleted.

When possible, ethnographic data and case study 
material, similarly treated to protect privacy and 
confidentiality, should also be made available. 

The data generated in most large-scale evaluations are 
an expensive resource and should be treated as such. 
They can be reanalyzed in the interests of critical 
appraisal of the original evaluation and in the interest 
of advancing the theory of program testing and the state 
of the art in evaluation. They can be useful for 
pedagogical purposes in university training and for staff
development in government and in state and local
education agencies. Mechanisms for ensuring that the
data are available for reanalysis include: provision of
support for documentation, storage, and dissemination of
data in major evaluation contracts; creation of explicit
agency policy on access to data; and statutory
requirements for independent review and, where
appropriate, reanalysis of original evaluation data.

Independent reanalysis of data generated by 
evaluations should capitalize on procedures that avoid 
compromising the privacy of individuals or the 
confidentiality of information. Audit agencies such as 
GAO, or independent researchers, may have a legitimate 
interest in verifying quality of data generated in an 
evaluation. The process need not and should not breach
promises of confidentiality made to individual
respondents or invade their privacy. A report 
commissioned by the GAO on assessing evaluation quality 
(Social Science Research Council 1978) recognizes the 
additional needs of avoiding needless disruption of 
research and harassment of respondents. The report
recommends several alternatives to the usual way of
reinterviewing respondents, including: independent
sampling of the target population to compare statistical 
results obtained by the auditor with statistical results 
obtained by the evaluator; use of evaluators independent 
of both original evaluation staff and audit staff for 
reinterviews; drawing a subsample of the original sample
for reinterview to minimize disruption of the research;
and other strategies. In many instances, regathering of
primary data is unnecessary: review of design,
execution, and analysis is sufficient for judging the
quality of major program evaluations (see also Hedrick et
al. 1979). The critical point is that original
evaluation information not be withheld by researchers,
sponsors, or any other parties; the more such information
is available, the less intrusive can be the approach 
taken in reanalysis and critical appraisal. 



STATE AND LOCAL ACTIVITIES

Funding and Independence 

The amount of federal money spent for evaluation 
activities at state and local levels is not 
inconsiderable. Webster and Stufflebeam (1978; see
Appendix C, Figure C-3) found that 35 large urban school
districts spent a total of nearly $34 million on research 
and evaluation, of which $21 million (or more than 
two-thirds) was federal funds. But funding for
evaluation varies widely. The size of local education
agency (LEA) budgets for the evaluation of Title I
programs has ranged from 0 to nearly $1 million for
programs that have a total budget of more than $100,000
to $52 million, respectively (Drezek et al. 1980; see
Appendix C). There is also great variability for
different programs; for example, an average of 1 percent
of program funds is spent at the local level for
evaluation of P.L. 94-142, the Education for All
Handicapped Children Act, and 7 percent for ESEA Title
IV-C, innovative practices and curriculum. Much of the
evaluation money made available through federal programs
is controlled by the state or local program
administrators. This tends to put the evaluators in
competition with program administrators for resources.
Evaluation projects may be approved or disapproved on the 
basis of their acceptability to the officials who run the 
programs. Bernstein and Freeman (1975) suggest that it 
is advantageous to have the program staff play a role in
the research process, preferably by having both the 
program and the evaluation units be part of the same 
overall organization. But unless an evaluation unit can 
operate with some independence within the overall 
organization and is given direct access to the leadership 
of the organization, it cannot (and will not) be trusted 
to produce credible work. 



Recommendation C-2. Congress should separate funding for
evaluations conducted at the state and local levels from
program and administrative funds.

The first reason for this recommendation is that such 
a separation will allow greater accountability for how 
evaluation money is being spent and who spends it. The 
current arrangement for most programs is to have 
evaluation money come from local program funds or from 
state administrative funds. No separate accounting is 
necessary. This makes it impossible to know how much of 
the federal money potentially available for evaluation is 
actually used for that purpose at the state and local 
levels. It is therefore impossible to judge whether 
inadequate performance of specified evaluation tasks 
comes about through lack of funds, inadequate training,
or other factors. 

The second reason for the separation is to introduce
greater integrity to state and local evaluations. Under
present circumstances, whatever amount of money is
invested in evaluation is, in too many instances,
controlled by those who administer and run programs.
This puts the quality and credibility of evaluation
activities in jeopardy. As long as program
administrators control evaluation funds, resulting
evaluation activities will be suspect. If evaluation is
to be an independent function that can provide an outside
view of program operations and effects, it must be
separately funded.

As a specific way of accomplishing the separation,
Congress may wish to consider a required percentage
set-aside for each program that would be devoted to
evaluation activities at the state and at the local
levels, with due consideration of thresholds below which
no activity can be carried out adequately. Such a
set-aside provision should be accompanied by reporting
requirements that account for the money spent and that
summarize evaluation results and their application. Over
time, it will then be possible to judge whether the
investment in evaluation is yielding the desired results
in terms of program monitoring and improvement.



Capability 

The competence and resources of the personnel charged
with evaluation responsibilities constrain their ability
to produce evaluations of acceptable quality. Only some
school districts, particularly the large urban or
suburban systems, have well-trained and sophisticated
evaluators. For many smaller agencies with limited
resources, staffing is inadequate for any of the complex
evaluation tasks such as process or impact assessments.
As Holley (Appendix C:258) notes:



In most states certification standards are
applied to personnel in [illegible] programs. For
example, a counselor, administrator, or supervisor
must be certified to fill those roles in most
states. In general, evaluators are not certified
and no such standards are applied to the personnel
filling the role of evaluator. In some LEAs and
SEAs, the federal program director or coordinator
may bear full responsibility for evaluation and
even in agencies with substantial evaluation units,
small federal evaluations may be completed by
program staff. Typically, where program staff are
given the responsibility for evaluation, they will
have neither training nor experience in evaluation,
methodology, measurement, nor statistical
analysis. The author has observed many small
school districts in which the person charged with
Title I program evaluation is a reading teacher,
not only with no training in evaluation, but with a
weak background in mathematics.

Even when third-party evaluations are used, this does not 
ensure either lack of bias or high quality, since school 
personnel charged with selecting contractors may or may 
not apply appropriate selection criteria. Moreover, the
competency of personnel in contracting organizations used
by local systems varies as much as that in the systems
themselves.

State agencies, in addition to carrying on their own
mandated and discretionary activities, are also charged
with a variety of responsibilities with respect to
evaluations carried out by local school systems.
Depending on the legislative provisions in a given
federal program, these may include "monitoring the
compliance of its districts with federal evaluation
guidelines, aggregating, analyzing and reporting data on
the state-wide impact of federal programs, and ensuring
that LEAs receive proper technical assistance in program
development and evaluation efforts" (Pion, Cordray, and
Boruch, Ch. 4:7 in Boruch and Cordray 1980). The size
and capability of evaluation staffs vary considerably
from state to state, and they are not necessarily
proportional to the school enrollment or to the number of
federal programs administered. Many states do not have
the capability to do more than minimally comply with
federal requirements, that is, forwarding the data
supplied by the local agencies.



Recommendation C-3. Congress should institute a
diversified strategy of evaluation at the state and local
levels that would impose minimum monitoring and
compliance requirements on all agencies receiving federal
funds, but allow only the most competent to carry out
complex evaluation tasks.

The Congress should require the Department of
Education to submit detailed program performance data.
Therefore, all state and local agencies receiving federal
funds for education programs should be required to
provide accounts of the allocation of program funds and
of program coverage. When specific services and
procedures are mandated, these too should be assessed for
compliance with the law (implementation accountability).
To accomplish this requirement, it may be necessary to
spell out in legislation dealing with evaluation
activities the resources, coverage or target groups, and
program services to be reported on by each recipient unit
(local education agency, state education agency,
community based organization, or other public or private
agencies). Congress should also require that the
Department institute quality control procedures that will
ensure usable and comparable data on program funding and
coverage.

Evaluation tasks that go beyond accountability
questions (for example, the assessment of educational
impact or the identification and testing of alternatives
that might lead to improved programs) should be a
selective activity rather than imposed on all, regardless
of competence and funds available. This recommendation
is not meant to suggest that creativity in providing
effective education cannot be found in school systems
with limited resources. Inventive teachers and
administrators have always found ways of applying the
lessons learned through experience to their classes and
their programs, but they do not do it through formalized
evaluation (David 1978). The task of understanding
promising approaches and applying such understanding to
program improvement at various sites is an extremely
complex one that needs considerable investment of fiscal
resources and the skill of highly trained people who are
unlikely to be available to every school system and state
agency in the country. Nor is it necessary that every
site carry out that type of evaluation. If more were
known about how to provide effective services through
studies carried out at a limited number of sites and if
school systems were then encouraged to try those
alternatives that appeared most promising, program
improvement could be expected.

The description by Holley (Appendix C) of three 
alternative means of funding local evaluations documents 
the utility of providing discretionary funds on a 
competitive basis for program improvement. Congress may
wish to consider authorizing a grants program for school
systems that would allow funding of the most promising
proposals for program improvement based on evaluation of
program alternatives that appear to be effective in a
given context (see Recommendation D-5 above).

Recommendation D-8. The Department of Education should
explore alternative approaches to technical assistance
for state and local evaluation needs.

The technical assistance needs of state and local
agencies are not uniform. They vary with the size of the
agency, the sophistication of the agency's evaluation
staff, and with the complexity of the federal program
activity in the agency. The regionally based technical
assistance centers associated with Title I are one
approach to meeting such needs. Whether the TACs are the
best form of assistance for all agency types and sizes
and whether the services they provide are adequate to all
needs should be explored more extensively. For
example, the development of technical assistance
capabilities in state agencies that also have authority
and responsibility for supervising local activities might
be a more reasonable and effective alternative. The
National Institute of Education used such a strategy in
building dissemination capacity within state agencies
(Raizen 1979). Or the support of state, regional, or
national networks of evaluators might permit the joint
exploration of complex problems for which solutions do
not yet exist (see Appendix C). Or seminars that bring
together evaluation practitioners with representatives
from a number of different disciplines could increase the
awareness of alternative research techniques that might
be brought to bear on complex problems and issues.

Technical assistance should also encompass 
organizational and personnel questions. Evaluators are
often recruited and hired by people with little 
understanding of the skills required in the practice of 
evaluation. Personnel officers may, for example, be 
unaware of the types of degree they should require or of 
the types of candidates to interview. Consultants are
hired to do evaluations, but their qualifications and 
training must frequently be reviewed by staff members 
unacquainted with evaluation. The relationship of
evaluators or an evaluation unit to program
administrators, executives of an education agency, its
governing board, and public groups is often not
carefully considered or is submerged in more powerful
organizational considerations. Technical assistance in
the area of evaluation organization and personnel
policies could draw on much work done already by some
state and local agencies as to optimal institutional
arrangements, personnel requirements, and procurement
policies for extramural work.

In particular, state and local agencies need to be
aware of the desirability of separating the evaluation
unit from program administration. Especially in the case
of impact assessment, there is an obvious conflict of
both intellectual and monetary interest. Evaluators
should in general be outside evaluators, and evaluations
should not be controlled by the program administrators.
The case is more ambiguous for formative evaluations,
those that are aimed at improving programs.
Responsible program administrators should be doing this
kind of self-evaluation as a matter of course, but there
are also powerful advantages of having outsiders do this
kind of evaluation: outsiders bring a fresh and unbiased
view and are likely to see new ways of solving problems
in program administration and new approaches for
improving program benefits. They are also not
constrained to cover up inadequate performance, as
internal evaluators may be inclined to do. The best
approach may be to encourage continuing in-house
evaluation efforts, but also to encourage agencies to
make greater use of qualified outside evaluators.
Technical assistance should help agencies organize their 
evaluation activities in such ways that they can derive 
the maximum benefit from their (and the federal) 
investment in this area. 



Recommendation: Congress should require an annual
report from the Department of Education on all evaluation
expenditures and activities, including those at the state
and local levels.

The current evaluation report delivered to the
Congress annually should be expanded to cover all the
evaluation activities within the Department as well as
those carried out by state and local agencies with
federal education funds. Past annual reports have
concentrated on the activities of the central evaluation
unit; they have not been comprehensive with respect to
evaluation activities carried out elsewhere in the
Department. More importantly, no analogous report is now
required of evaluation activities carried out at the
local and state levels; even figures on federal dollars
spent on evaluation at these levels are unobtainable, let
alone any substantive account of either mandated or
discretionary activities. It is therefore impossible to
discern to what effect evaluation dollars are used at
these levels except through special studies. Until more
complete accounts are available of the total extent and
nature of the activities carried out, the quality and
management of evaluation cannot be improved.

The Department's report should specify the amounts of
federal dollars spent for evaluation at the national,
state, and local levels, and breakdowns of funding should
be given by type of activity. Summaries of studies under
way, findings and critiques related to completed studies,
and their application to improvement of the substance and
management of programs should also be included in the
report. In addition, Congress may wish to request a
brief report or special section on "What Has Been
Learned," which draws from all relevant sources of
knowledge, including evaluation and research not
supported through federal education funds, to consider
how programs can be made more effective through changes
in legislation, management, or program strategy.



NOTES

1 The cited studies cover several social service
fields. Evaluations in education may in fact have a
better record than some others. Rezmovic (1979), in
summarizing reviews of evaluation studies in the
criminal justice field, finds that there are very few
if any studies without serious shortcomings that
jeopardize the credibility of study results. She
cites Logan (1972), who examined 100 correctional
research studies and found not one that met minimal
methodological requirements for testing effectiveness.

2 We use Sharp's definition (see Appendix B) of
"private-sector performers": all those not connected
with a university or with a public education agency,
local or state.

3 Major performers are defined as those that spend $1 
million or more on educational research and 
development.

4 The term 8-A refers to a special form of
noncompetitive awards. An 8-A firm is a small
for-profit business concern that is owned, controlled,
and operated by one or more person(s) who are socially
and economically disadvantaged. To be eligible, the
concern must have submitted a business development
plan to the Small Business Administration (SBA), which
must have approved it for SBA assistance. An 8-A firm
can be selected to deliver goods or services to the
federal government without having to compete with
other firms.

5 A resource list compiled by NIE of minority firms
competent to do R&D work in education during that
period contained 185 entries; about two-thirds were
8-A certified.

6 Some of these programs are the Graduate and
Professional Opportunities Program (GPOP) in the
Department of Education, the Minority Fellowship
Programs in NIMH, the Minority Postdoctoral Fellowship
Program and the Women and Minorities Program in NIE,
the Minority Access to Research Careers (MARC) in both
NIH and ADAMHA, the Minority Fellowship Program in
NSF, and the Health Center Opportunities Grants (HCOG)
in HRA.

7 This information, including the dates given in Table
1, was provided by Priscilla (Pat) B. Dever,
Administrative Officer, Office of Program Evaluation,
U.S. Department of Education. We are grateful for her
help and patience in responding to our inquiries.

8 The 16 regional educational laboratories and R&D
centers have a special relationship with the federal
government through which they receive core funding
outside the competitive process, some of it for
evaluation studies, though they may, and several
do, also bid on RFPs. Of ten $5-million-plus
performers of educational R&D, two are regional
laboratories; nearly all these institutions fall into
the $1-million and over or "major performer"
category. Because they have long-term relationships
with the Department, they are in a favorable position
to receive contracts for evaluation work.

9 This provision was enacted at the behest of state
education agencies so that they could plan adequately
for their own data collection systems. It is
questionable, however, whether evaluation studies that
gather one-time information (even if collected more
than once, as in pre- and post-testing or in



... these data systems to any extent.

10 A cooperative agreement is a type of award used as an
alternative to a contract when a project requires
substantial involvement of the sponsoring federal
agency during project performance. Substantial
involvement may be necessary because the project is
technically or managerially complex or requires close
coordination with other federally sponsored work.
Examples are policy studies, projects requiring
complex subcontracting, large curriculum projects, and
evaluations of federal programs. For a detailed
definition, see P.L. 95-224.

11 A basic ordering agreement is a written instrument of
understanding between the government and a contractor
that sets forth negotiated clauses to be applicable in
future contracts, including a description of supplies
or services to be furnished and of the method for
determining fees to be paid. This instrument is
generally used in conjunction with a selected group of
contractors found to be qualified to furnish the
specified supplies or services when needed.

12 A recent evaluation of the TACs (HOPE Associates
1979:60) found diverse views of their effectiveness
among state agency personnel. One of the reviewing
panel's recommendations was that

. . . the Office of Education begin to
investigate, during the period of the next
contracts for Technical Assistance Centers,
the possibility of a future system that has
flexibility to accommodate to the diversity
of state and local capabilities and needs,
and also the enlarged objectives of Title I
evaluation technical assistance, particularly
including the uses of evaluation for local
program improvement and the strengthening of
local evaluation capacity.



Using Evaluation Results 



A frequently voiced statement about evaluation is that 
evaluation findings are rarely used. Often this type of
statement is followed by the criticism that few policies 
have been changed and few programs either terminated or 
started because of the findings from evaluation. 
Implicit in this criticism is a belief that "utilization" 
means direct and often immediate incorporation into 
policy and program. The criticism carries weight mainly 
for those who have a definition of utilization that comes 
close to making it a substitute for the political 
process. We do not take that position. In our view, 
utilization takes on a variety of forms, not all of them 
immediately evident.

Indeed, we maintain that the main goal that evaluation
can rightfully espouse is that of being "useful": that
is, evaluation-based knowledge is disseminated to those
audiences that have a need or an interest in it, is
presented in a fashion that is understandable to them,
and is addressed to the policy questions that are
relevant to them. Evaluation cannot and should not
substitute for the political process. Nor can evaluators
ensure that evaluations are used. The best one can do is
to make sure that evaluation findings are available to
those who might want them and that the findings address
the issues of concern in an understandable and
responsible way.

Because much of the difficulty with utilization
centers around the differing meanings of that term, in
the first two sections of this chapter we discuss the
varieties of utilization and some of the limitations that
constrain the use of evaluation findings. Next we
summarize the evidence on how evaluations actually are
used and show that considerable use is made of evaluation
results, even though evaluations rarely shape social
policies in a sharp and immediately obvious manner. The
next section discusses the research literature on how
science-based knowledge is used and how its use can be
enhanced; the final section identifies the various
audiences for evaluation findings, their information
needs, and what the Department might do to better serve
those needs.



"Utilization" has been used to cover a variety of things,
a semantic imprecision that lies at the root of a common
impression that evaluation results are rarely
"utilized." One major source of difficulty lies in the
failure to distinguish between dissemination and
utilization. Another major source of difficulty is that
"utilization" has been used to mean overt changes in
social policy and programs as well as uses of evaluation
findings that fall far short of changing social policy.



Dissemination and Utilization

It has been recognized for some time that dissemination
of knowledge does not necessarily lead to its use, though
it is a requisite first step. For purposes of this
report, dissemination of evaluation findings means the
deliberate communication of knowledge derived from
evaluation activities; utilization refers to the use of
such knowledge when decisions are made about educational
policies and programs. Such use may include instituting
a change as a result of having considered the evaluation-
based knowledge. However, "dissemination" is often used
to mean or imply utilization and subsequent change; that
is, utilization and change are viewed as an almost
automatic by-product of communication. This use of
"dissemination" is unfortunate and misleading because
recent empirical studies on utilization and change make
it clear that knowledge, however packaged and
disseminated, has little compelling power in its own
right (see, for example, Caplan et al. 1975, Caplan 1980,
Berman and McLaughlin 1975-78, Human Interaction Research
Institute 1976). These findings hold for
purpose-specific information such as program evaluations
as well as for forms of knowledge for which the relation
between knowledge production and intended use and
audience is less obvious.

The distinctions between dissemination, utilization,
and change are important to keep in mind. Dissemination,
because it is largely under the control of evaluators and
sponsors, can be improved by self-conscious efforts.
Improvements in dissemination strategies can usually be
made that, other things being equal, ought to lead to
greater utilization and to change when indicated. But
other things are generally not equal: the forces and
events impinging on decisions about programs may be more
powerful than evidence from evaluation activities.
Moreover, such evidence is often couched in statistical
terms that are not translated into terms having
substantive meaning or that may not be substantively
significant. Steps can be taken to ensure wide and
effective spread of information and thereby improve the
likelihood of utilization, but we know of no means that
can ensure utilization, let alone change.



Forms of Utilization 

There is currently a very strong emphasis on using the
results of evaluation for making specific decisions at a
given time; for example, when legislative or budgetary
decisions are anticipated or when changes in program
regulations or management are being considered.
Sometimes, this perspective is appropriate, as was the
case for the NIE compensatory education study, which
began with some specific issues and fairly well-defined
problems (National Institute of Education 1976) and chose
to investigate factors that could be controlled through
changes in policy (Hill 1980). The desire of those who
initiate and pay for evaluations (Congress, the
Department, state and local governments) to obtain
immediately applicable results is understandable, but it
can lead to inappropriate expectations.

In particular, the grounds for decisions cannot always
be specified beforehand. For example, funding decisions
are sometimes declared to be the policy questions that
the results of evaluations are to address. Yet funding
decisions are generally made on a variety of grounds,
many of which cannot be addressed by evaluations, as has
been amply demonstrated by the history of impact aid,
Head Start, Follow Through, bilingual education, and other
programs that became popular with beneficiaries and
service deliverers. A program may develop such strong
constituencies that the results of evaluations become
largely irrelevant to funding decisions. As another
example, the evaluation of alternative compensatory
education interventions used in Follow Through was to
identify the most effective model for wide-scale
implementation (Elmore 1975). It turned out, however,
that there was more variation within models than between
models; moreover, increased funding to permit increases
in the program never materialized.

The possible decision issues also change over time in 
unpredictable ways. Turnover among federal executives is 
high. Questions that are tied to the perspectives of
an individual decision maker or of a particular 
administration may no longer be of interest when a new 
executive or administration takes over. Decisions also 
change as educational priorities change over two or three 
years, even under the same administration. 

In short, while evaluation for specific decisions 
appears to be a sensible strategy to follow, such a 
strategy may be much wasted effort. The issues involved 
in a decision that is to be taken at some time in the 
future are not easily predicted. Hence an evaluation 
started today that is directed towards the specific 
decisions envisaged two years hence is just as likely as 
not to miss the mark because the issues in the decisions 
will have changed. 

One implication of the above is that evaluations
should seek out questions of lasting significance and
provide knowledge that can be used and reused, knowledge
that may be exploited in several different ways over time
in addition to furnishing short-term information
(Chelimsky 1977). Involved here are differences in types
of knowledge application, i.e., knowledge for
understanding versus knowledge for immediate action,
sometimes also referred to as conceptual use (indirect
impact on decision perspectives) versus instrumental use
(direct, mechanical application) (Weiss 1977). To ensure
the maximum utility of any major evaluation, it should
address questions appropriate to both uses. Adopting
this principle has consequences for the planning of
evaluations (see Recommendation D-10, below).

A third use of evaluation can be called
legitimization: the primary purpose of the evaluation is
something other than to develop knowledge about a
program. The reason for initiating the study may be more
important than the eventual results, such as meeting
legal requirements for evaluation, demonstrating the
objectivity of an agency's decision making, or supporting
some particular point of view (e.g., the need for more
program funds). Though such motives are not often
overtly acknowledged, the use of information that results
from such evaluation studies is not necessarily
illegitimate provided valid data are reported and
interpreted honestly.



One of the problems in defining the process of
utilization is that not all study results ought to be
used and that deliberate rejection or nonuse of results
that are faulty or otherwise inapplicable is preferable
to misuse. Misapplication of results is as much a
negative consequence of evaluation as lack of
application, and deliberate nonuse may represent rational
decision making as much as does appropriate
application. The problem is that the deliberate
nonuse after results have been carefully considered and
dismissed for valid reasons is difficult to distinguish
from the failure to use evaluation results for other
reasons.

Aside from nonuse for valid reasons, it is important
to distinguish between the misuse or nonuse that results
from lack of judgment and that which has as its
motivation the suppression of valid information. Persons
who may not be fully aware of the standards of quality
that should be applied to evaluation studies may hail the
results of faulted work and condemn on seemingly
technical grounds quite well-executed studies. This lack
of judgment calls for attempts to inform potential users
of the standards by which various types of studies should
be judged. The recommendations made elsewhere in this
report on open and systematic review of evaluation
studies should be helpful in judging quality. (Our
recommendations on training in Chapters 3 and 5 are also
intended to address this problem.)

Deliberate misuse or nonuse of evaluation studies is
in many ways more difficult to deal with. First, it is
difficult to detect motives. Second, it is not likely
that persons deliberately abusing evaluation studies
would be dissuaded by arguments based on
considerations of quality. The best that evaluators and
the Department can do is to make sure through review of
evaluations that those that are defective are clearly
identified and that exemplary evaluations are also
clearly identified. Full publicity should be given to
the evaluation review procedure and its results.



Just as the definitions related to utilization are
important to understand if one wants to improve the
utilization process, so are the functions of knowledge
within any agency or for individual decision makers, at
whatever level. Evaluation cannot and should not
replace the political process. This means that an
automatic translation of evaluation findings into policy
decisions is neither desirable nor to be expected.
Policy makers cannot override the ideological, political,
and financial limits they face, though these limits are
themselves subject to change over time, aided by the
accumulation of knowledge. Decision makers and managers
are not always able to take actions that seem to the
researcher the "best" form of intervention or
implementation. Both the feasibility and the
acceptability of a change in public policy are as
critical as science-based knowledge in determining the
course of a decision (Ezrahi 1978). Thus a program that
is feasible and effective but likely to arouse the
resistance of significant constituencies, or that can be
funded only at the expense of some other more desirable
program, or that is liable to antagonize school
administrators or teachers, is not likely to be adopted.
Nor should it be, given that legislatures and public
officials are expected to be responsive to such
realities. There is no special democratic license given
to the results of evaluation that allows such results to
override the ordinary political considerations that
surround education just as they surround other important
areas of social policy.

So it is important that, from the outset of any
evaluation, the range of options and political realities
regarding timing, variables, and likely decisions be made
clear by the likely users. Early collaboration between
researchers and decision makers in planning the research,
identifying variables, specifying time frames, and
defining the problem under study will help toward wider
and more profitable use of social science research,
especially program evaluation, within the political
context of social problem solving (see Recommendations C-1
and D-1).

Though we use the term "decision maker" in this
report, we do not mean to imply that decisions about
programs are made as if there were sovereign rulers in
government. Yet evaluation reports are often written as
if such individuals existed and were able and ready to
act on evaluation findings and recommendations. As we
noted above, the persons who initially ordered and
collaborated in planning evaluations and their
utilization may have moved on to other responsibilities
by the time findings are available. Their successors
often have less interest in or less understanding of the
purpose of the evaluation. In addition, interests
sometimes shift rapidly at the top echelons of government.

Having some documentation of the purpose and
importance of a study that can be referred to after the
authority for decisions has changed would help in
utilization. However, as has become evident from
research on organizations (see, for example, Cohen and
March 1974, Cohen et al. 1972), policy is often not
"made"; rather, it accumulates by slow accretion. New
information may actually slow down the process since it
may make decisions more complicated. Thus, one has to
think of policy formulation and decision making as
involving different stages, different people, and a
process of absorbing and digesting all types of
information: tested empirical findings from evaluations
are only one of those types.

While the reduction of ignorance may always be
desirable, it is not synonymous with the reduction of
risk. In fact, new information may produce considerable
risks as it enters an organization. Perturbations go
through the organization: established assumptions and
ways of doing things become threatened, agenda priorities
and budget line items may be thrown into question, and so
forth. The common response to such threats is to let
procedure take precedence over substance and to ignore
the message of the new information in the interest of
preserving established procedures and structures. To the
outsider, it may appear that the information is ignored,
though it may be used informally. Studies carried out on
the use of knowledge among upper-level federal officials
in the United States and abroad show that the control of
information is more important than its use (Caplan
1980). The bureaucratic nature of state and local
educational agencies has been amply documented (Murphy
1974); maintenance of the organization is also a priority
goal. So, if knowledge use is to be furthered, stress
must be placed on understanding bureaucratic rationality
and on being nonjudgmental about it. It really is no
less "correct" than individual or scientific rationality,
but it is different and will deal differently with
information.



EVIDENCE ABOUT UTILIZATION 

To what extent is the impression correct that evaluation
results in education are little used? Who does use
evaluation results and who does not? The most
comprehensive review addressing this topic consists of
the recent case studies done by Leviton and Boruch (Ch. 6
in Boruch and Cordray 1980) and the accompanying analysis
of the existing literature on evaluation utilization.
The analysis, which generally confirms the findings of
earlier research, is summarized below.

First, despite the difficulty of tracing utilization,
there are a number of well-documented cases both at the
national and at local levels in which evaluation findings
were used directly in modifying laws or regulations,
influencing choices of curricula or instructional
strategies, or altering management practice. For
example, of the 42 evaluation activities included in the
section on use in the 1979 Annual Evaluation Report (U.S.
Department of Health, Education, and Welfare 1979b),
one-third were specifically cited in congressional
documents or led to identifiable revisions in regulations
and other management procedures.

Second, cases of conceptual use, or contribution to
the accumulation of knowledge about a program, are
obviously more difficult to verify. Nevertheless, there
is evidence from interviews with congressional staff
(Florio 1980) and research on the behavior of federal
executives (Caplan et al. 1975) that some of the major
sources of information (e.g., the Congressional Research
Service) used in Congress and by executive agencies are
based on research evidence, including evaluation
findings. Often, such research-based information is used
for framing issues, developing program ideas, and general
oversight rather than for immediate decision making.
This type of knowledge use is not always apparent even to
the user, let alone recognized by an outsider.

Third, in the last few years, the majority of
evaluation studies have been concerned with
implementation and managerial process, the type of study
most likely to lead to direct application. In this,
evaluation is not different from other social science
research; Caplan et al. (1975) found that more than half
the use of social-science-based knowledge by federal
executives was to increase administrative efficiency and
organizational control. The use of results from program
effect studies has been more difficult to discern, and
even when such studies are cited, it is not the findings
on effects, but those on coverage and management that are
used. The evaluation study of the bilingual education
program provides a good example (Danoff 1978).

Fourth, a continuing problem in relation to
utilization is the failure to spell out the ways in which
the information developed by a study could be applied.
What policy options appear preferable to reach certain
goals? What management strategies deliver services
effectively? What are the outcomes of different
curricula in different types of classrooms, for different
types of students? When evaluation studies address
questions not perceived as important by a particular
audience, that audience is likely to consider the results
irrelevant and useless. For example, a number of local
sites have reported that the data required by the federal
government on Title I and other education programs are
not useful to the local agency (David 1978), while others
consider such data useful but needing to be augmented by
specific local studies in order to gauge program progress
(Boruch, Leviton, Cordray, and Pion, Ch. 6 in Boruch and
Cordray 1980).

Fifth, there has been little attempt to specifically
reach audiences concerned with equal educational
opportunity. Women, minorities, and handicapped people
generally believe they have limited access to social
science research and evaluation processes that they see
as affecting programs that are significant to them.
Because of this perception of exclusion, some of the
largest groups involved in equal opportunity issues, such
as the NAACP, ASPIRA, COSSMHO, the National Urban League,
and the National Council of La Raza, are developing their
own capability for research and development or have begun
to work closely with research organizations willing and
capable of addressing issues of interest to minority
groups. The Council for Exceptional Children performs a
similar function for programs serving handicapped
children, as do women's organizations for programs of
concern to them. As long as groups representing
beneficiary interests see themselves as peripheral to the
sharing of information produced by evaluation, there is
likely to be unnecessary controversy and friction.



TOWARD INCREASED UTILIZATION

The preceding sections have attempted to define various
types of knowledge use, discussed the setting or context
for use, and briefly reviewed the evidence on the degree
of use. Before considering what might be done to
increase the use of evaluation results, we summarize what
has been learned about the utilization of research
knowledge in general. The research literature is replete
with recommendations on how to improve the likelihood
that knowledge will get transferred from producer to user
and actually used (see, for example, Havelock 1969, Davis
1973, Glaser 1973, Havelock and Lingwood 1973, Rogers and
Shoemaker 1971, Zaltman et al. 1973). Those
recommendations tend to cluster around two sets of
factors: the nature of the information and how it is
communicated.



Nature of the Information

The ways in which knowledge is produced and is perceived
by its potential audience(s) affect its use. The
important characteristics of knowledge associated with
increased likelihood of use can be summarized as
intuitive correctness, objectivity, and relevance (Caplan
1977). Obviously, there is not much that researchers can
do to produce knowledge that fits the first
characteristic, that seems to match common sense or to
"feel right." However, intuitive correctness is probably
most important only in the early stages of policy
formulation, for needs assessment and for considering
intervention possibilities. Perceptions of objectivity
are usually enhanced by distancing evaluation from
program operations, but, as noted in Chapter 2, this may
also make results less relevant for some audiences. The
reverse is true as well. Relevance involves
continuous interaction between the primary audience and
the researcher, although that may affect the researcher's
objectivity.

There are several important elements in achieving
relevance:

• Negotiated content. Evaluators, sponsors, 
individualsi and groups comprising the primary 
audience (s) (if other than the sponsor) and action sites 
or program managers must negotiate what issues^and 
information needs can be addressed in terms of 
researchable qaestiona and^what types of data it will be 
possible to collect at program sites* Such negotiation 
is not a one-time-only task; it should proceed throughout 
the evaluation so that the study is not stymied or does 
not turn out to be irrelevant. 

• Appropriate research forms. Insofar as 
methodological limitations allow, the research should aim 
to use the policy maker's or primary user's definition of 
the problem. Re^searchers too often tend to define the 
research to fit methodologies rather than the interests 
of the likely audience. The law of instruments has a way 
of taking overt that which can be measured is measured, 
whether or not it addresses objectives or concerns of 
interest to the policy makers or program managers. 

• Realism. The research questions addressed and 
the interpretation of results must deal with options that 
are realistic for the decision makers expected to take 
action. The variables under study should be ones that 
are politically malleable; that is, they can be changed, 
if necessary, in order to improve policy or program 
substance. For example, periods of reading instruction 
can be lengthened, but a 1:1 student/teacher ratio, even 
if effective in teaching reading, is unrealistic on a 
wide scale because of its cost. Implications and 
recommendations must take into account the constraints of 
likely users, such as political acceptability or budget 
limitations. 

• Timeliness. It is especially critical for direct 
knowledge application that information be timely. If a 
study is to provide input to legislative or funding 
decisions, but is not geared to the authorization 
calendar or the budget cycle, it will be irrelevant to 
the primary audience(s). While what may be relevant 
today may not be relevant tomorrow, increased contact 
among parties at interest and evaluators will improve the 
probability that relevant questions will be addressed. 

Attention to these elements was a major factor in the 
success of the NIE compensatory education study (Hill 
1980). And portions of effectiveness studies deemed 
relevant, namely those having to do with coverage and 
resource allocations, have been used in formulating 
legislative amendments, appropriations, and changes in 
regulation, even when the findings on effects appeared to 
be ignored: for example, the histories and use made of 
the sustaining effects study (Systems Development 
Corporation 1976) and the Title VII bilingual education 
study (Danoff 1978). (Citations in congressional 
documents of these studies and other documented uses are 
given in Boruch, Leviton, Cordray, and Pion, Ch. 6 in 
Boruch and Cordray (1980).) In Chapter 5 on the 
organization and management of evaluation activities, we 
make some recommendations pertinent to increasing the 
relevance of evaluation studies. Timeliness in 
particular and current impediments to completing studies 
on time are treated at some length in Chapter 5 (and also 
in Chapter 3). We reiterate the need for quick-response 
evaluation capability on the part of the Department, as well 
as sophisticated planning of major evaluation tasks that 
will yield at least some useful results at the time they 
are needed by primary decision makers in Congress or at 
the top levels of the Department. 



Communication of the Information 

The many factors that have been identified in the 
literature as enhancing the transfer of knowledge and its 
use can be grouped under two headings: communicability 
and linkage. Communicability encompasses matching the 
style of communication used by the researcher or other 
transfer agent (see below) to that of the primary 
audience(s). Since researchers are not necessarily the 
most effective communicators, nor will they always be on 
call when needed, linkage by means of transfer agents is 
necessary. 

Several principles about communicability have emerged 
from the literature and successful practice: 

• Intelligible reports. Reports to primary 
audiences should be tailored as much as possible to their 
needs and their situation (Patton et al. 1977). Language 
should be understandable and situationally applicable; 
e.g., papers and reports written for scholarly audiences 
are rarely appropriate for the primary or other 
audiences. Too often, social science researchers write 
for their colleagues and, even when studying issues of 
pressing public concern, tend to emphasize the esoteric, 
counterintuitive, or paradoxical. Social scientists in 
the United States have a special fascination for numbers, 
but more emphasis should be given to the substantive 
meaning of evaluation findings, not to their numerical 
properties and the niceties of the statistical analyses. 
Reports should avoid jargon, be written in plain English, 
and address in a straightforward manner the issues 
relevant to the intended users and their informational 
needs. If a number of different audiences have primary 
interests, several versions (or translations) of a report 
may be necessary. 

• Accentuating the positive. Whenever possible, 
recommendations ought to highlight positive action steps 
that can be taken. Things not to do are important to 
recognize as well, but they rarely carry the same kind of 
reward for individuals in a position to act. 

• Live communication. The print medium is not the 
only nor even the most effective means of communication. 
Face-to-face interaction and reporting through 
conferences provide alternative mechanisms. This allows 
clarifying questions and making sure that the most 
important points are covered. Information is more likely 
to be used when it comes from sources that are trusted, 
and human beings trust other human beings whom they have 
found to be reliable in the past more than they trust a 
computer terminal. Redundancy of communication has 
proven effective, so that optimal dissemination 
strategies are likely to include both oral and written 
communication. 

As we noted above, linkage is the term used to cover 
the gap that may exist between researchers and the 
audiences for their findings. Techniques to create 
linkage derive from research on communication and the 
spread of innovation (Katz and Lazarsfeld 1955, Rogers 
1962). Lippitt (1965) and Havelock and Lingwood (1973) 
single it out as the most critical step. The issue is 
not just mechanisms of knowledge transfer, but 
information management, storage, retrieval, and knowledge 
synthesis. Past RD&D (research, development, and 
diffusion) efforts by the Office of Education were 
premised on the assumption that knowledge transfer and 
linkage through organizational arrangements would be 
effective, but the example of the Congressional Reference 
Service shows the importance of people who act as the 
translators or linkage agents. Experience with the 
Educational Resources Information Center (ERIC) also 
indicates that a computerized system for storing and 
retrieving research information works best when a live 
person acts as intermediary between the questioner and 
the system. Linkage can be performed by in-house staff 
(for example, individuals in the evaluation unit or in a 
separate dissemination component) or by parties external 
to either the research or the user communities. 

Some important factors that affect linkage include: 

• Responsiveness to differences. Transfer agents 
or groups must be responsive to differences between 
researcher and audience and to differences among 
audiences: perspectives, values, motivation, and 
language. They must know how to translate from one to 
the other and when direct interaction should take place 
and when not. (For example, some researchers make 
excellent congressional witnesses; others, equally 
eminent in their field, do not.) 

• Mediating problem definitions. Even at the 
beginning and during the course of a study, transfer 
agents can be useful because — speaking the language of 
both the researcher and the audience — they can help 
define policy decision problems in researchable terms. 
This role can be especially important when the intended 
user is not the immediate sponsor of the evaluation and 
therefore does not have automatic contact with the 
researcher. Problem definitions and criteria used by 
those requesting an evaluation must be understood by the 
researcher and be a guide to what will be done in a 
study. They must also be clarified so as to be 
researchable, or the reasons they are not must be 
conveyed to those requesting the evaluation. (As we 
noted in Chapter 2, examples of unresearchable problems 
are the measurement of effects for diffuse or broad-aim 
programs for which objectives cannot be specified, the 
measurement of the aggregate effects of a program that 
takes different forms in thousands of different locales, 
or the effects of weak treatments administered in complex 
settings.) 

• Human agents. Linkage is best achieved by people 
rather than by cold-terminal (computerized) systems, 
although this may change as the computer culture becomes 
more pervasive and terminals become more accessible in 
location and in language. At present, however, decision 
makers are still used to face-to-face communication for 
most important transactions, which only later get 
codified in print (Chelimsky 1977). 

• Open systems. Bureaucracies, including 
legislative and executive agencies at the federal, state, 
and local levels that deal with education, tend to be 
self-referential systems; that is, people in 
bureaucracies look for information that comes from the 
inside and find it more credible. This characteristic is 
also true of other people in the evaluation process, such 
as the various interest groups. For example, teachers 
tend to consult other teachers and their professional 
associations when they need information; groups 
representing minority interests have set up their own 
research components. It also applies to knowledge 
producers, i.e., researchers, particularly those who are 
university-based and are not dependent for their 
livelihood on communicating with potential sponsors of 
evaluations. Transfer agents can help make all these 
groups more aware of outside information. But to go 
beyond awareness and expect linking or transfer agents to 
increase responsiveness to information would require them 
to understand the function of information in each group 
and the risks that the use of information entails for 
each. Transfer agents are not likely to be able to 
counteract behavior based on maintaining cherished 
assumptions or well-established procedures and that 
therefore has a need to ignore perturbing research 
findings. 



Recommendation D-9. The Department of Education should 
test various mechanisms for providing linkage between 
evaluators and potential users. 

The Department might consider establishing a unit 
charged with studying, developing, and instituting 
knowledge transfer mechanisms and evaluating their 
effectiveness. Alternatively, outside experts might be 
charged with this responsibility. Appropriate activities 
of a linkage unit, whether within or outside the 
Department, would include: 

• Helping assess proposed dissemination plans for 
evaluation studies and suggesting improvements; 

• Performing needed translations of evaluation 
reports so that they can be understood by the intended 
audiences; 

• Funding research (in conjunction with the NIE 
dissemination research unit) on the access, transfer, 
communication, and utilization of evaluation information 
issuing from studies sponsored by the Department and 
elsewhere; 

• Developing effective techniques for the 
synthesis, storage, and retrieval of evaluation studies 
on a continuing basis; and 

• Developing and installing regular procedures and 
institutionalized arrangements designed to facilitate the 
use of evaluation data on a day-to-day basis, at least 
within the Department. 



AUDIENCES FOR EVALUATION FINDINGS 



If the main purpose of evaluations is to help develop 
more effective policy and improve education programs, who 
are the audiences that are likely to use evaluation 
results in this way? What kinds of information do they 
need? And how can evaluation planning be improved to 
better serve those needs? 

Conventionally, evaluations at the national level have 
been considered relevant to two primary audiences: 
policy makers in Congress and in the federal agency 
(i.e., the Department of Education) and federal program 
managers. In this simple view, policy makers would use 
the findings from evaluations to determine present and 
future program needs and directions, and managers would 
have a tool by which to improve the delivery of services 
mandated in programs. As evaluation results have become 
visible, however, it turned out that they have also 
served as ammunition for critics of controversial 
programs or as support for a program's advocates. 
Federal legislators, convinced of the importance of local 
decision making in education, have also been concerned 
with local use of evaluation results to improve programs 
within the local school system. 

Empirical evidence from studies of the use of 
evaluations (e.g., Boruch, Leviton, Cordray, and Pion, 
Ch. 6 in Boruch and Cordray 1980, Brickell 1974, Alkin et 
al. 1979) has shown that not all of those audiences can 
be served by any single overall study. The information 
needs of diverse audiences with varying and sometimes 
conflicting interests and perspectives make it virtually 
impossible for one evaluation study to satisfy them all. 
Policy makers may be mainly interested in coverage 
issues, program managers in efficient delivery, and 
recipients in issues of equal educational opportunity. 
Each of these interests requires a different approach, 
even different data collection. 

Perhaps the clearest example of the problems of 
diverse interests is the case of Title I evaluations 
(Wisler and Anderson 1979, Cross 1979, David 1978, 1980, 
Reisner 1980). The major evaluation strategy used since 
the inception of this program has been collection of data 
at the local level that, through aggregation at the state 
and national levels, was to serve the information needs 
of all three levels of government. The result has been 
the generation of large quantities of data that have not 
been useful at either the local or the national level — a 
costly and frustrating process leaving all parties 
dissatisfied. The failure of Title I evaluations had 
been blamed on the lack of competence at the local level 
to collect data that can be aggregated. While the 
competence of some local evaluation units may be an 
issue, the history of Title I evaluations illustrates a 
much deeper problem, namely, the confusion of evaluation 
purposes. The original intent of the congressionally 
mandated local evaluations was to serve the needs of a 
local audience, defined by some to be the parents of poor 
children and by others to be the local school 
administrators and teachers. Later demands for assessing 
the overall effects of Title I spawned a complicated 
system of aggregating from the local to the state level 
and from the state to the national level. When it turned 
out that data emanating from thousands of different 
sources proved noncomparable, Congress mandated technical 
assistance to the local systems to help with procedures, 
designs, measures used, and problems encountered at the 
local level. Models for evaluation designs were 
developed and the technical assistance centers were 
created to instruct local evaluators in proper use of the 
models. Yearly costs for this assistance system now 
stand at $12 million, more than half the budget of the 
central evaluation unit. And yet complaints about the 
utility of Title I evaluation information continue. 
Local school systems find the data they are required to 
collect by federal directive of little use to them and, 
if they have the resources and the competence, they 
conduct their own program improvement studies. At the 
national level, Congress has consistently expressed its 
dissatisfaction with the information it receives, as 
evidenced by the rewriting of the evaluation requirements 
for Title I that has occurred in every reauthorization of 
the program. Congress finally resorted to commissioning 
its own study, which was carried out by the National 
Institute of Education, a unit that was independent of 
the Office of Education (P.L. 93-380, Section 821). 
Leviton and Boruch (Ch. 6 in Boruch and Cordray 1980) 
summarize the evidence on the usefulness of the NIE study 
to its audience, citing specific changes in law and 
regulations in six major program areas directly traceable 
to study findings. Much of the success of this study as 
contrasted to all the other Title I evaluations is 
explained by its director (Hill 1980) as due to the 
extensive consultation with the primary audience, 
Congress. 

To increase the probability that results will be used, 
the plans for an evaluation should spell out who the 
primary audiences are likely to be and how it is planned 
to reach them, so that both the substantive issues and 
the dissemination strategies can be negotiated with 
them. However, there will often be a number of secondary 
audiences. For example, an evaluation concerned with 
testing alternative curricula in career education to 
facilitate local choice may also affect the regulations 
governing federally supported vocational education 
programs. For evaluations conducted at the national 
level, decision makers (within the agency and Congress) 
and managers at the federal level are likely to take 
precedence. But where federal funds are made available 
for state and local evaluations, needs at those levels 
should be served. 



The Role of Planning 

Although planning does not necessarily lead to an agenda 
that is subsequently carried out in detail, the act of 
planning always leads to an improved sense of priorities, 
provides a forum in which competing interests can reach 
accommodations, and induces an active as opposed to a 
reactive stance toward essential activities. 



Recommendation D-10. The Department of Education should 
institute a flexible planning system for evaluations of 
federal education programs. (See Recommendation D-1.) 

A flexible and workable planning system must have 
several attributes. First, it ought to provide for 
appropriate information for the predictably recurring 
legislative cycles on education programs. This entails a 
standard sequence of studies, timed to be available for 
reauthorization and appropriation hearings, that will 
furnish information on the coverage of programs, 
descriptions of how they are run, and a synthesis of 
information available at any given time of what can be 
said about their effects. Second, there must be an 
ongoing program of evaluation studies carried out at the 
deliberative pace required to address problems that are 
poorly understood. Third, the Department must have the 
ability to respond to interesting questions that arise as 
a result of ongoing research, changes in policy, or 
development of new programs. 

In the past, the central evaluation unit of the 
Department has concentrated resources on massive studies, 
in part because such studies require fewer procurement 
actions to allocate available funds. But big studies 
invariably take longer than anticipated and become highly 
inflexible; hence they often end up addressing matters of 
tangential interest to the audience at hand when they are 
finally completed. Any evaluation plan for a major 
education program should contain a series of linked 
studies, some of which furnish factual information that 
can be obtained in reasonably short time and some of 
which address issues of long-term interest. Thus, at any 
particular time and especially at predictably recurring 
decision stages, one or more additional sets of findings 
about a program will be available. Additionally, the 
value of the whole evaluation plan does not depend on the 
success or failure of a single massive study or on the 
performance of a single contractor; there will always be 
some useful studies resulting from the overall plan, even 
though some may not turn out as hoped. In addition to 
the plan for the NIE study of Title I, examples of such 
evaluation planning are the original plan to evaluate the 
Education for All Handicapped Children Act (U.S. 
Department of Health, Education, and Welfare n.d.) and 
the Department's new evaluation plan for Title I of ESEA 
developed in 1979 (U.S. Department of Health, Education, 
and Welfare 1979c). The Committee applauds the 
Department's direction in this respect and believes that 
it will help make the Department's studies more relevant 
to the immediate concerns of decision makers and 
departmental managers. Before any costly evaluation 
study is undertaken, however, ways in which it can inform 
decisions and the risks of the evaluation questions 
changing during the course of the study should be 
outlined through the type of evaluability assessment 
described in Chapter 2 or through some similar process. 

The absence of a reasonable planning system in the 
Department has had two deleterious consequences. 
First, it has given rise to an emphasis on activities for 
"putting out the fire": projects done in response to an 
immediate crisis because no suitable information was at 
hand when the question arose. Not infrequently, such 
projects are irrelevant by the time they are completed, 
either because the crisis has subsided or a different one 
has arisen and attention has shifted. The emphasis on 
addressing immediate concerns has reduced the 
Department's ability to evaluate programs on a recurrent 
basis in a fashion that would cumulate evidence on their 
implementation and effectiveness over time. Studies to 
develop and test out more effective program alternatives 
receive even shorter shrift. 

The second effect of the absence of appropriate 
planning has been to create yearly uncertainty, beyond 
that created by the budget process, about what studies 
the Department will undertake. When yearly planning is 
not set in the context of approved ongoing plans, the 
approval process takes longer than necessary and may be 
subject to capricious and arbitrary decisions. The 
history of fiscal 1980, when it took 6-9 months to obtain 
approval for initiating a study, provides a vivid example. 



Recommendation D-11. The Department of Education should 
establish a quick-response capability to address critical 
but unanticipated evaluation questions. 

No matter how flexible the planning system, there will 
be a continuing need to respond quickly (within a 2- to 
6-month time frame) to evaluation-related questions that 
come from the Congress or from top-level Department 
officials. Department staff charged with evaluation 
responsibilities must be in a position to deal with such 
requests. In some areas, in-house expertise may exist, 
but even under the best of circumstances such expertise 
cannot be expected to cover the great variety of topics 
that may surface at various times. Several extramural 
mechanisms are available for a quick-response capability: 

• Lists of contractors can be maintained who, as a 
result of being found qualified in specified areas 
through the RFQ process, can be awarded small contracts 
within days for work that is limited in scope and time. 
This mechanism in the form of basic ordering agreements 
has been used by the Assistant Secretary for Planning and 
Evaluation (ASPE) in the former HEW; the dollar limit on 
contracts was $60,000. 

• Highly qualified selected organizations can be 
awarded contracts that pay for a given number of 
person-hours of effort, with tasks to be specified as the 
need arises. This mechanism has been used in the 
Department of Labor, with the limit for any one-year 
contract set at $200,000. 

• 8-A contracts and awards to SBA-eligible firms 
can usually be executed more quickly than other types of 
contracts. 

In order to be fully responsive to the information needs 
of its primary audiences, the Department must be able to 
combine a deliberative planning process that allows time 
for field and constituency involvement with a 
quick-response capability that can address unanticipated 
but critical evaluation questions as they arise. 

The capacity to serve short-term information requests can 
be considerably enhanced in any program by the 
development of good management information systems. 
Thus, for example, if a good management information 
system had been in place, it should have been possible 
for the Spanish/English bilingual education program 
(Title VII) to have provided Congress with detailed 
information on the ethnicity and language status of the 
students being served. Instead, a study intended to 
assess the impact of the program had to use a 
considerable share of its resources for documenting 
program coverage (Danoff 1978). Similarly, data on such 
matters as trends over time in the composition of 
students enrolled in education courses in colleges and 
universities ought to be routinely collected as useful 
and necessary background on the future supply (over 
or under) of teachers. 

For many programs that are not funded through the 
Department, the provision of such management information 
may be difficult to the point of impossibility. But for 
federal programs, the Department should consider the 
possibility that good management information systems may 
provide much of the information that may be required 
about a program for many decision-making purposes. Such 
systems must be carefully designed, however, to provide 
information that is likely to be useful, rather than 
trying to cover all contingencies. As we note in Chapter 
5 below, grantee reports have too often been collected 
without ever being reviewed. 



The discussion of different audiences for evaluation 
results that follows tries to indicate different 
information needs for each. Two facts should be noted: 
there are important distinctions within broad classes of 
potential users or audiences, and sponsors are sometimes 
but not always synonymous with primary audiences. The 
latter fact means that the process of negotiating 
research questions and other substantive issues may have 
to involve a number of parties. 



Primary Audiences for National Evaluations 

Executive Policy Staff 

This category includes individuals with authority over 
resource allocations and the design of programs, most 
importantly, senior-level agency officials and their 
analytical staffs and budget examiners in the Office of 
Management and Budget (OMB). It is rare, if ever, that 
these officials are waiting for evaluation study results 
in order to make up their minds on what policies to 
pursue or what programs to fund. The weight of an 
evaluation may be slight in comparison to the 
constellation of interests and other reasons for deciding 
one way or another, even in ways counterindicated by an 
evaluation study. 

The temptations to misuse or not use the results of 
evaluation studies are all too clear; hence the 
importance we place in this and other chapters on the 
obligation of evaluators to release findings 
independently of executive decision makers. These 
temptations are also the reason (as we indicated in 
Chapter 3) for recommending that all evaluation studies 
be subject to review, the results of which are made 
public (see Recommendation D-6). 

One of the problems in the utilization of evaluation 
results is that findings may not be disseminated to all 
persons involved in making decisions at the executive 
level. This is often true for OMB staff, who are 
generally not in the "loop" of people who normally 
receive evaluation reports, so their information needs 
may be served inadequately. In addition, turnover of 
top-level agency officials in education has aggravated 
the problem of loss of information and institutional 
memory. On the other hand, agency officials have the 
advantage of being able to draw on their policy and 
evaluation staffs, who are probably the most consistent 
users of evaluation data while also being the likely 
immediate sponsors of evaluations. 

The potentially short life of evaluation findings, 
even though the knowledge might be useful at a later time 
and in a different context, means that dissemination 
should not be just a one-time effort. Archived 
evaluation studies that are difficult to obtain and whose 
existence is difficult to determine are useless. Hence 
some attention should be given to the problem of 
re-dissemination of evaluation findings, perhaps in the 
form of summaries or reviews of past evaluation findings 
for executive-level officials as programs and policies 
come up for review. 



Congressional Policy Makers 



It is a mistake not to differentiate among congressional 
users of information. Rarely are members of Congress 
direct and immediate audiences. Rather, the initial 
contacts are more often with the Congressional Research 
Service (CRS) staff, committee staff, or personal staff 
of members of Congress. In addition, staff of the 
Congressional Budget Office and of GAO are frequently 
prime audiences for evaluation studies. CRS, as part of 
the Library of Congress, functions as a quick reference 
service for both members and committees of Congress; GAO 
carries out special studies at the behest of 
Congress. Congressional staff themselves differ in 
their use of evaluation information: senior staff of 
committees are generally better informed users of 
evaluation results than personal staff of individual 
members, who tend to be junior, must cover a much broader 
range of issues, and must generally find evidence to 
support a member's view. There are also differences 
among types of committees: authorization committees tend 
to cite evaluation data more frequently than 
appropriations committees (see Boruch, Leviton, Cordray, 
and Pion, Ch. 6 in Boruch and Cordray 1980), 
proof, perhaps, of the fact that budgetary decisions 
often are not heavily influenced by the results of 
program evaluation. 

It is relatively easy to document the explicit use of 
evaluation studies by Congress and its staff: who makes 
what information requests and receives responses from 
CRS, who has received copies of evaluation studies, and 
who refers explicitly to those studies in committee 
reports and in the published remarks of members of 
Congress. But there is also a more informal and diffuse 
infiltration of information into congressional discourse 
that is much more difficult to trace because it leaves no 
explicit markers. Thus, a Congresswoman who remarks on 
the floor that a particular program is working well may 
mean that she has talked to a school principal in her 
district who assured her that without the program his 
schools would be suffering, or she may mean that she has 
received a memo from one of her staff who had summarized 
an evaluation report from the Department of Education, or 
she may be referring to an assessment from GAO, or she 
may merely be expressing her own opinion based upon 
whether or not the program is "in line" with the kinds of 
things she usually supports. We suspect, along with 
others, that this informal, diffuse use of evaluation 
results may be the most important use of all, but it is 
not something for which one can readily provide direct 
documentation. 



Federal Program Managers 

Program managers are likely to be interested in 
information that can improve delivery of educational 
services at the local levels. Since they are often 
already committed to a given program, effectiveness 
information may seem irrelevant to them except insofar as 
it enhances support for the program. On the other hand, 
information on how programs are being implemented and 
what services are being provided to what beneficiaries 
can lead to improvement in program regulation and 
management. However, if the changes suggested by 
findings of process evaluations are too disruptive of 
established procedures, they are not likely to be 
implemented. 



Recommendation D-12. The Department of Education should
ensure that evaluations deal with topics that are
relevant to the likely users. (See Recommendations C-1
and D-1.)

As discussed earlier, relevance is not easy to
achieve, but it is relatively easy to specify procedures
that will make it more likely. Such procedures include:

• Primary audience(s) must be specified from the
beginning of the study.

• Arrangements must be made to facilitate
communication between evaluators and intended users at
the inception of a study and throughout its course. This
will help ensure the fidelity of the evaluation to the
questions of interest to the identified audience(s) and
will also help obtain commitment and interest on their
part. Current administrative restrictions that inhibit
that kind of communication should be removed.

• When the goal of an evaluation is to provide
information for decisions at specified times, such as the
reauthorization of programs or annual program
appropriations, reports must be delivered on time. If a
study has been delayed, its abortion should be considered
unless some aspects will address longer-range concerns.

• Evaluation monitors should be charged with the
responsibility of including in their routine monitoring
information about events and changes that carry
implications for the usability of findings. Changes in
evaluation design or methodology are sometimes made in
response to field conditions, budgetary and clearance
constraints, or for other reasons. Such changes may have
sufficient impact on a study so that the research
questions framed to be relevant to the identified
audience(s) can no longer be addressed adequately.
Changes in the conduct of an evaluation that have such
impact on the possibility of utilization should suggest
rethinking the objectives of the evaluation or
terminating it altogether.






Other audiences also have a stake in federal education
programs and therefore in evaluations of them, even if
the questions addressed have been framed by the concerns
of federal legislators or executives. Of course, some
studies done at the national level specifically
address the information needs of a nonfederal audience,
for example, representatives of minority and other
beneficiary groups. For studies initiated by or at the
behest of any of these other audiences, our
classification of primary and secondary audiences would,
of course, be reversed.



State and Local Agencies: Central Staff

The distinctions made at the federal level among decision
makers, evaluation (and other analytical) staff, and
program managers are also important at the state and
local levels. The motivations and general information
needs of the staffs are analogous, but focused on the
program as it operates in the local setting. Since the
policy variables that can be altered by state and local
administrators are considerably different from those that
can be altered by federal staff and Congress, evaluations
must address different questions. Similarly, program
management at the federal level entails quite different
responsibilities from program management at the state and
local levels, and process evaluations that are intended
to improve management must be sensitive to these
differences.

Local Agencies: Principals and Teachers

The individuals who actually provide the educational
services intended by a program (and their
representatives, such as the National Education
Association (NEA), the American Federation of Teachers
(AFT), and associations representing school principals)
can become a powerful constituency for or against a
program, as has been demonstrated by the history of Head
Start and the experiments with voucher programs.
Evaluations can be threatening or supportive: threatening
if they appear to suggest a reduction in a program viewed
as useful, supportive if they offer help to teachers in
doing a better job with a program. If the purpose of an
evaluation is to do the latter, then it must address
program elements that are under the control of teachers
or principals. For example, demonstrating differential
effects of a program for different population groups is
not helpful to teachers or principals since neither can
select whom they will teach. However, demonstrating
differential effects of alternative program strategies
may be helpful, since teachers can select the strategy
most appropriate to their school situation and students.



Program Clients and Their Representatives

The ultimate targets of education programs are students.
Since much of the investment in federal education
programs is at the elementary level, obviously many of
the beneficiaries are too young to be audiences for
evaluation information. However, there have been
specific attempts to address evaluations to parents so
that they could use the results to improve their
children's schooling. As we noted above, this was the
explicit intent of the original Title I evaluation
mandate (the first legislated requirement for evaluation
in education) as originally proposed by Senator Robert
Kennedy in 1965 (David 1978). The objective has seldom
been met, even when parent advice was legislated into
later Title I amendments in the form of parent advisory
councils. Groups other than parents also speak for the
interests of beneficiaries, most of whom are poor,
members of minority groups, handicapped, or otherwise the
targets of discrimination. The interest of these
groups, which include the major advocacy organizations
concerned with equal opportunity and minority issues, is
to use evaluation information to ensure that the intended
beneficiaries are adequately reached by the programs
intended to serve them and that those programs deliver
effective services.



Researchers 

The outcomes of any evaluation study will be of interest 
to other evaluators and researchers who are concerned 
with development of educational policy, with 
instructional strategies and school management, and with 
the technical issues arising in the conduct of applied
research. Evaluators and researchers should have ready
access to evaluation reports. In addition, primary data
should also be available to researchers so that secondary
analyses and cross-evaluation analyses can be carried
out. The importance of providing for secondary research
is demonstrated by the Cook et al. (1975) reanalysis of
the "Sesame Street" evaluation that showed that, although
the target population (poor children) had indeed made
gains in reading readiness, as documented by the original
evaluations, the gap between them and more affluent
children had actually grown because the latter made
greater learning gains. In order to provide for
secondary research, reports, primary data, and
publications of evaluation-related material should be
archived in professional journals and as monographs (see
Recommendation [number illegible in source]).



Media

Discussions of evaluations are more likely to find their
way into professional and trade journals if results turn
out to be controversial. If the program being evaluated
is itself of sufficient interest, the controversies are
likely to be picked up by the more popular media:
newspapers, television, and radio. Obviously, these are
secondary audiences for evaluation results, but the way
in which evaluators communicate with them may make a
crucial difference in the reporting and interpretation of
what a program is all about and what evaluation is all
about.



Reaching Audiences 

Recommendation D-13. The Department of Education should
ensure that dissemination of evaluation results achieves
adequate coverage.




Evaluation utilization has been assigned a high
priority within this Department, but utilization cannot
happen unless people have a chance to consider relevant
information. Therefore, it is important to establish
clearly that attention to dissemination is not a pro
forma exercise. Indeed, the agency must, through its
actions, indicate as great a commitment to dissemination
concerns as to research design, measurement, and
analytical procedures. Staff who prepare RFPs and
monitor evaluations and external contractors or grantees
must both understand that attention to dissemination is 
not just a "boilerplate" requirement, but that 
dissemination plans will be subjected to the same 
scrutiny and assessment as are evaluation designs and 
methodology. 

At the very least, evaluation results must be
communicated (delivered) to the primary audience(s).
This requirement would seem self-evident, but it often is
not met. Contract clauses routinely forbid dissemination
before formal approval by the sponsor, which is sometimes
withheld. As Boruch, Cordray, and Pion note (Ch. 5 in
Boruch and Cordray 1980), this keeps some (though not
all) evaluators from reporting on their findings. Also
routinely, a very limited number of copies of final
reports are printed (100 copies for most studies unless
unusual circumstances exist), with the result that
landmark studies like the Title VII bilingual education
study (Danoff 1978) quickly become out of print. In some
cases, a copy of the final report cannot even be found in
the project files (Cook and Gruder 1979). In other
cases, like that of the NIE compensatory education study
(National Institute of Education 1977), a stockpile of
copies actually exists, but it is difficult to get
information about how to get copies. In cases of lengthy
reports with multiple appendices, archives like ERIC
contain only part of the material originally published.
Restrictions on the number of copies and on archives, not
to mention more costly dissemination strategies, are
often imposed by contracting rather than technical agency
staff in order to reduce budgets but without
consideration of dissemination needs.

All RFPs and grant announcements should include 
requirements for a dissemination plan that is oriented 
toward maximizing the likelihood of utilization. The
evaluation of proposals should give appropriate weight to
the quality of the dissemination mechanisms proposed.
Budget negotiations should recognize that adequate
dissemination is costly and cannot be an afterthought.
Dissemination plans should include:

• Specification of primary and secondary audiences; 

• Delineation of the different information needs of
the specified audiences and how those needs will be
served, such as different types of reports including more
or less technical material;

• Provision for an adequate number of copies of
reports and other salient material to be distributed to
the specified audiences;

• Strategies for reaching audiences through means 
other than printed reports, e.g., conferences, throughout 
the course of the study; 

• Specification of timetable events, e.g., 
congressional hearings, that provide occasion for 
reporting on findings; 

• Mechanisms for reviewing and revising the
dissemination plan during the course of a study to take
account of changes in the study or in the context of the
work;

• Plans for archiving reports and other
documentation of findings so that they remain accessible,
with a guarantee by the contractor that data will be
clean and accessible (see Recommendation D-7); and

• A budget commensurate with the proposed
dissemination activities.



Recommendation D-14. The Department of Education should
observe the rights of any parties at interest and the
public in general to information generated about public
programs.

Though minimal dissemination is concerned primarily
with the immediate or primary audience, other people
having an interest in the program being studied are
likely to demand and should have access to evaluation
findings. This raises two issues: What are the special
rights, if any, that should be afforded the agency that
has requested and funded an evaluation, e.g., the
Congress, the Department, OMB, or GAO? To what degree
should traditional authority relationships be overridden
in order to serve the public interest, i.e., what
obligations do evaluation units and contractors have to
disseminate findings to potential users who are outside
the command and report lines within tables of
organization?

Findings from evaluations must be made available to
those who are importantly affected by the programs being
evaluated: for example, those who manage them, those who
provide program services, and those who are intended to
benefit (or their representatives). Since evaluations
are paid for with public funds, they should also be made
available to the public at large. The Committee is aware
of the dangers in providing too much autonomy to
evaluation units and contractors, but public interest
needs suggest that, at the dissemination stage,
evaluators should be guaranteed a certain degree of
autonomy.

Four steps are needed to provide improved public
access to evaluation findings:

• Proper safeguards for maintaining the rights to
privacy of individuals and organizations must be applied
before release of findings;

• The rights of the sponsoring authority to
exclusive access to evaluation results should be limited
in time;

• The right of managers and executives to restrict,
control, or suppress evaluation findings should be
limited in time; and

• Reports on findings should be accompanied, when
available, by interpretations and critiques issuing from
the review process recommended in Chapter 3.

Appropriate changes should be made in contract provisions 
to allow contractors and grantees the necessary 
flexibility with regard to distribution of reports and 
other dissemination strategies. 



Recommendation D-15. The Department of Education should
give attention to the identification of "right-to-know"
user audiences and develop strategies to meet their
information needs.

Perhaps the most neglected audience for evaluation 
studies consists of program beneficiaries and their 
representatives. We recognize that this neglect is not 
so much intentional as it is produced by the very real 
difficulties of defining this set of audiences in a
reasonable way. In order to more closely approximate the 
ideal that all those having a recognized interest in a 
program should have reasonable access to evaluation 
results, the Department should consider dissemination of 
evaluation reports freely to groups and organizations 
that claim to represent major classes of beneficiaries of 
education programs. Positive, active dissemination to 
such right-to-know groups may include such specific
activities as ascertaining their information needs prior
to evaluation design and during the evaluation, preparing
standard lists of groups and organizations to whom
evaluation results are disseminated routinely, and
seeking out comments and critiques of evaluation reports.

Since it is to be expected that such right-to-know
groups will be different for different evaluations,
careful consideration of the appropriate right-to-know
groups should be part of the dissemination plans that
contractors are asked to prepare as part of their
response to RFPs and grant announcements.

We recognize that this recommendation makes the whole 
process of sponsoring and carrying out evaluations more
complex, but we consider the involvement of right-to-know
groups critical. They often perceive themselves as 
having limited access to or insignificant involvement in 
evaluation efforts that may be used for policy and 
resource allocation decisions that concern them. 
Furthermore, such groups can have an important influence 
on the improvement of educational practice, and they need 
access to information so that their recommendations and 
actions are as effective as possible. Involvement of 
these audiences from the very outset of an evaluation 
enriches the public policy process both because it widens 
the universe of viewpoints and because, over the long 
term, it can improve the quality of education insofar as 
these groups are links to the communities that the 
government is attempting to serve. If they share in the
evaluation process from the beginning, they are more 
likely to use the findings in their spheres of influence. 



Changing User Behavior 

Recently Sechrest (1980) has suggested that, if
high-level administrators could be trained in how
evaluations are done and how researchers present results,
utilization would be increased. We include suggestions
for such training in Recommendation D-17 in Chapter 5.
We have some doubt, however, that top executives or 
members of Congress have the time for such training or 
would retain technical knowledge that they would use 
infrequently. If they did develop greater facility for 
the language of evaluation, they would certainly become 
more sophisticated readers. 

It is possible to think of incentives for use and
sanctions against failure to use evaluation results
within the lower echelons of federal and local program
management. For example, program managers and program
personnel might be required to respond to evaluations
with appraisals and critiques, to provide plans for
incorporating valid findings into their program
operations, and to document subsequently whether the
planned changes had been made. Some states (Rhode
Island, Massachusetts) do indeed require reports from
local school systems on the use of Title I evaluations.
However, there is also some danger that such requirements
will turn into additional pro forma exercises. Required
responses and actions might also make explicit some
conflicts between managers and analytical staff about the
value of a program or the effectiveness of its management.

Recent reforms in the federal civil service provide 
special bonuses for effective program management, and 
appraisal of management is tied to the results of program 
evaluation (Office of Management and Budget 1979).
However, the success or failure of a program is at least
as much dependent on its design and legislative
provisions as it is on the efforts of program managers
and personnel, so the attempt to judge good management
performance through program evaluation may be off target
unless only those factors under control of the program
manager are examined. A second effect of this particular
incentive system has been to define management objectives
in clearly measurable terms (e.g., items of priority mail
answered on time) rather than in terms of the more subtle
and less objectively measurable behaviors that are needed
for effective program management, such as frequent and 
productive interaction with state and local staff. 

Sanctions for failure to institute changes suggested 
by evaluation results have also been suggested, for 
example, withholding program funds until the changes are 
made. The history of cutoff of federal funds for 
violation of civil rights laws suggests that this 
particular sanction is very unlikely to be imposed. 
Consequently, we make no explicit recommendation on the 
use of incentives or sanctions. However, the Department 
might consider requesting that federal program managers 
who have had their programs evaluated prepare evaluation 
use reports. These might be prepared within one year 
following receipt of the evaluation report and contain an 
assessment of the level and types of uses made (including 
reasons for nonuse) as well as an analysis of factors 
that impeded or facilitated use. If the Department
proceeds with such a requirement, the dissemination and
linkage unit proposed above should be charged with the
additional responsibility of assessing whether drawing
the attention of program managers to evaluation
information in this manner actually improves its chances
for use.



1 The literature on putting knowledge to use has grown 
as rapidly as the evaluation field itself. Davis (in 
Human Interaction Research Institute 1976) has 
estimated that, by the mid-1970s, the research
literature concerned with the field of knowledge
utilization included some 20,000 citations, compared
with 400 such citations 20 years earlier.

2 For example, Marsh et al. (in press) found that 
changes in rape law had produced a statistically 
significant decrease from 12 to 10 in the average 
number of examination procedures that a rape victim 
had to undergo if she reported the crime. Obviously, 
in substantive terms of victim humiliation, one could 
hardly report this as a meaningful change. 

3 The average tenure of a Commissioner of Education 
during the last decade has been less than 2 years; NIE 
has had six changes of leadership in 8 years.

4 We analogize from a definition by Yin et al. (1976) of
situations regarding the adoption of innovations:
adoption is regarded as a positive outcome if the
innovation leads to improvement but as a negative
outcome if it does not; failure to adopt is a negative
outcome only if the innovation would indeed lead to
improvement but a positive outcome if it would not.

5 Head Start teachers deciding to increase the time 
spent on prereading activities are as much decision 
makers in their realm as a superintendent installing a 
new curriculum, a state legislature passing an 
appropriation for compensatory education, or a federal 
program manager developing program regulations. 

6 Of course there is always a question as to who can
represent beneficiaries. The Committee has made no
attempt to address this question in any detail, both
for lack of time and because we did not consider
ourselves qualified to define such representatives.
We note that there are groups that speak on behalf of
specific beneficiary groups; their claims to represent
these groups could, perhaps, be considered in the same
light as the claim of public officials that they
represent the public.

7 Evaluations done by individuals or units that also 
have operational responsibility for a program are 
generally mistrusted. How much more objective
evaluation becomes when it is done by third parties,
but still under the auspices of the program, is not
clear, particularly when future evaluation contracts
from the same source are a possibility. Evaluations
performed or sponsored by units outside a program are
not necessarily free of bias either, whether performed
in-house or contracted out, especially when top
decision makers are known to favor particular points
of view.

8 Appropriate packaging has also been deemed important, 
but many counterexamples exist. For example, the 
attempt to develop social indicators resulted in a 
handsome publication (Office of Management and Budget 
1973, U.S. Department of Commerce 1977) with 
attractive and easy-to-read graphics, yet it has found 
limited use. 

9 As we discussed above, there are risks for 
bureaucracies of having to deal with new information. 
Other groups also run risks: for example, audiences
concerned with equal educational opportunity may find 
negative results on programs they favor distasteful 
and disturbing. 

10 The distinction is not always clear. Sometimes, 
expectations for use at all levels are set up when 
data required at the federal level are collected by 
staff at the local level, as in the case of Title I.
In some cases, it may be most efficient to sponsor a 
study at the federal level even when the results are 
pertinent to individuals at the local level; for 
example, testing the efficacy of alternative 
strategies for teaching reading. 

11 The national-level evaluation of ESEA is not intended 
to take the place of the three-tier evaluation of 
Title I based on local data collection and aggregation 
at the state and national levels. Rather, it is a 
substitute for previous efforts at the national level 
to study the effects of Title I, specifically, the 
sustaining effects study (Dearman and Plisko 1979, 
U.S. Department of Health, Education, and Welfare 
1979a, Baker and Ginsburg 1980). 

12 As described in Appendix A, fiscal 1980 was the first
year for which there was a comprehensive review of
evaluation plans from different components of OE, and
that review did not include the relevant activities of
NIE and the National Center for Education Statistics
(NCES). The new Department has attempted to institute
a more centralized evaluation planning system; at this
time, one cannot gauge the degree of its
implementation or success.

13 The changing role of GAO and its success in responding
to new demands have been described by Levitan and
Wurzburg (1979) and by Mosher (1979). Though Congress
broadened GAO's mandate as early as 1945 to include
monitoring of the administration of programs as well
as of expenditures, it was not until 1967 that GAO
became active in the field of program evaluation: a
review of OEO's antipoverty programs was its first
effort. In the succeeding decade, GAO has been
changing its staff and organizational structures in
order to carry out with greater effectiveness the
increasing number of program evaluations undertaken by
the agency. At present, studies carried out by GAO
range from investigations of misallocation of funds
within government agencies to impact evaluations of
social programs and even to the evaluation of
evaluations carried out by executive agencies (U.S.
General Accounting Office 1977, 1978).



5 

Organizing and Managing 
Evaluation Activities 



Many of the issues of quality and utilization discussed
in the preceding two chapters are related to the way in
which federal, state, and local education agencies
support and sponsor federally funded evaluations.
Dealing with those issues requires consideration of three
major factors:

• Responsibility. What kinds of evaluation
activities is the Department expected to carry out as
part of its oversight functions and of its effective
management of federally funded education programs? What
should it do for effective policy formulation? What
ought to be the responsibilities of local and state
education authorities?

• Organization. How are the evaluation activities
now organized in the Department and why? How should
those activities be organized in order to maximize
capabilities and incentives for producing reliable
information and high-quality analysis?

• Constraints. What are the impediments to
producing evaluations of high quality and using results
effectively? Which of the impediments are the result of
external constraints and which are due to internal
procedures? Which of the external constraints can be
alleviated? How can internal processes be improved?

Discussion of these issues reinforces a number of the
recommendations made in earlier chapters. In this
chapter we suggest guidelines for balancing the need to
decentralize and to coordinate evaluation activities; we
also make some additional recommendations on improving
the management of evaluations.

RESPONSIBILITY FOR EVALUATION ACTIVITIES 

In Chapter 2 we discussed in general terms the different 
types of policy questions that are asked about 
established or proposed programs. In this section we
consider what kinds of evaluations need to be carried out
in order to address those policy questions for education
programs.

Accountability 

The Department is accountable for carrying out education 
laws in three respects: ensuring that moneys are
allocated as specified, ensuring that benefits go to the 
targeted groups, and ensuring that civil rights 
provisions and service mandates are being met. 

Fiscal Accountability 

Because of the decentralization of education, the
allocation of funds for most major programs takes place
at all three levels of government: federal, state, and
local. (A few programs provide for federal grants
directly to local agencies.) Hence, all three levels
must account for the use of federal education funds, and
fiscal reports from local and state agencies form the
basis for the Department's own fiscal reports. Grantee
reporting is checked periodically by the agency's
inspector general. For a few titles, e.g., vocational
education state grants, such auditing is mandatory in
law; for the most part, however, the Department has
discretion as to what local and state reports and
disbursements are audited. Nearly one-fourth ($10
million) of all evaluation funds are spent on fiscal
audits; generally, programs with large outlays (Title I
of ESEA, post-secondary grant and loan programs) receive
most attention (see Appendix A).

As audits have gone beyond checking for sound fiscal
management and into checking for compliance with legal
requirements on the use of funds, the line between fiscal
audits and other accountability evaluations has become
fuzzy. For example, whether Title I money is used to
supplant regular school funds or provide supplemental
services to eligible participants (Martin and McClure
1969, Stanford Research International 1977a, 1977b) has
become an issue affecting the substance of what goes on
in the classroom. The early problems with supplanting
have caused most school systems to provide "pullout"
programs that can be easily accounted for separately,
even though they may not be the preferred educational
option (National Institute of Education 1977).



Accountability for Beneficiary Coverage 

Grantee reports have generally served as the most
comprehensive source of information on program
participation. Though local agencies are obviously in
the best position to count participants, there are two
problems with the use of such self-reporting:
reliability of the reported data and lack of information
on who is not being served. Reliability can be
documented through third-party checks on grantee
reports. If grantee reporting for a specific title turns
out to be highly unreliable, technical assistance on
interpretation of the law (e.g., defining participants
properly) may be warranted; alternatively, incentives and
sanctions that encourage misinterpretation need to be
examined and adjusted to bring grantee performance and
reporting in line with the intent. It is doubtful
that the Department will ever be able or wish to replace
grantee reporting on beneficiary coverage, but it must
accept responsibility for the accuracy of such reporting.

How to document the number of potential beneficiaries
not being served is quite another matter, however.
Establishing the universe of eligible participants falls
under the heading of needs assessment. The incentives
and disincentives for conducting accurate needs
assessment may be strong at the local and state levels:
there is an incentive when having more eligible
participants means getting more federal dollars; there is
a disincentive when federal dollars are accompanied by
matching provisions that call for greater contribution
from local and state than from federal sources. At the
federal level, there are also strong incentives: program
administrators who do not want to see their programs grow
are rare indeed, yet this responsibility is often
assigned to a program office, as was the case in
developing P.L. 94-142 (Education for All Handicapped
Children Act). Because of the incentives, we conclude
that needs assessment ought to be carried out not by
program offices, but by parties with no stake other than
accuracy in the outcome. The cooperation of local and
federal program managers is necessary, however, since
needs assessment must be informed by intimate knowledge
of the local context and of potential program benefits.

Accountability for Civil Rights Mandates

Accountability for civil rights mandates takes two 
different forms in education. The first involves the
enforcement of civil rights statutes in any way related
to educational institutions, whether built into federal
education legislation or decreed by federal courts, and
is based on federal responsibilities under the
Constitution. At the same time, the provision of
educational services is constitutionally a state
responsibility, delegated to local authorities.
Enforcement of statutes relating to civil rights and
equal educational opportunity has become the
responsibility of the Department because it can withhold
federal funds in the event of noncompliance. As with
fiscal accounting, a separate office headed by the
Assistant Secretary for Civil Rights is responsible for
compliance, and it is not considered an evaluation
activity per se.

The second form of accountability arises because some 
civil rights statutes require certain kinds of 
educational services. Two groups are specifically
covered in this manner: all handicapped children are
entitled to a free appropriate public education under
P.L. 94-142, and Title VII of ESEA (in accord with the
Lau court decision) requires schools to provide
instruction that does not put a non-English-speaking
child at a disadvantage. Such educational services that
are spelled out in laws or in regulations tend to be
based on perceptions of constitutional rights rather than
on social science evidence about needed services.
Consequently, monitoring activities may overlap.
Responsibility for compliance with service mandates may
belong to the program office, but selective checks are
often carried out by the Office of Civil Rights. An
example is the labeling and testing of handicapped
children. Since these two kinds of offices tend to
respond to different constituency groups (minorities and
other targets of discrimination on one hand, school
systems and educational institutions on the other), they
generally have distinctly different views of what ought
to be expected of grantees. Overlap of responsibilities
is not undesirable if it is included in overall
evaluation planning; otherwise it leads to inefficient
use of resources at best and antagonism between units of
the Department at worst.

Program Implementation

Except for provisions connected with civil rights and
equal educational opportunity, federal education
legislation often does not spell out mandatory
educational interventions or treatments. The
constitutional delegation of responsibility makes
decisions in education a jealously guarded right of local
and state authorities. Exceptions are such demonstration
programs as Follow Through or Experience-Based Career
Education, in which school systems are given the choice
of one of several specified curricula. Since the
rationale of demonstration programs is developing and
testing effective interventions, documenting the nature
of the services provided through them ought to be an
integral part of any evaluation research associated with
them. There are also some ESEA titles that include
explicit process specifications, such as the requirement
for developing an individual education plan (IEP) for
every handicapped child served under P.L. 94-142. In the
case of such mandated educational processes, especially
those instituted on little evidence as to their effects,
more than mere compliance checking is also needed.
Evaluation should be carried out to find out the degree
to which such processes contribute to the overall goals
of the legislation, for example, to provide more
effective education for handicapped children or--in the
case of bilingual education--for children whose native
language is not English. Documentation of program
process and implementation has been carried out at all
government levels and, within the Department, by both the
cognizant program units and the central evaluation unit.

Program Effects

With few exceptions, federal funds allocated at the
elementary and secondary levels are intended to bring
about improved education for those students who have not
been served adequately in the past. Because the total
amounts spent are large, Congress from time to time
has called for information on program effects. In the
past, the response by OE has been the commissioning by
the central unit of large-scale impact assessments that
consume several years and millions of dollars, as
exemplified by the sustaining effects study carried out
by the Systems Development Corporation (1976, Baker and
Ginsburg 1980). There have been several problems with
such efforts. First, what Congress often wants and needs
is information on effective delivery, in the sense of
having accurate accounting for how a law is being carried
out, as described above. Better specification of the
questions to be answered in any legislation calling for
assessment (as recommended in Chapter 2) would help avoid
misdirected evaluation efforts; even more important is an
ongoing dialogue on congressional needs between key
congressional staff and Department staff responsible for
evaluation.

Second, even when assessment of program effects is 
called for, expectations of the size of those effects are 
often exaggerated because of unrealistic promises during
the legislative and appropriation processes. But by the
very nature of federal education programs, effect
expectations should be modest. Whatever educational
service is envisaged as a result of federal dollars, it
will be delivered in a decentralized manner through some
16,000 local school systems in the public sector
comprising nearly 90,000 school buildings. There are
more than 2 million teachers in the public school
systems, and another 250,000 people are teaching the 10
percent of students in nonpublic schools. (Private
school students also receive benefits under Title I and
other federal programs.) Federal programs operate at the
margins of this huge enterprise, providing 8 percent of
all revenue for public elementary and secondary schools
(Dearman and Plisko 1979). Moreover, most federal
programs are geared to specific populations; in those
cases, support for core education, the major
responsibility of the teacher, is expressly ruled out.
Yet the children who receive benefits from any of the
federal titles do not do so in isolation from the rest of
their education. Finally, as we have noted, federal
programs more often than not have multiple and
amorphously defined outcome goals, though they are
usually explicit regarding distribution of benefits. To
expect strong treatment effects under these
circumstances--for example, improvement throughout the
country in school achievement of a target group or
lessening of racial tensions--is to ignore the nature of
the educational system in this country.

When the effects of a given program are modest, their
estimation is a complex, difficult, and costly task.
Such estimation should be done only when it is likely to
affect program decisions (for example, in the case of a
limited experimental program) and only by the most
competent evaluators and evaluation organizations. 



One of the Department's responsibilities is to provide 
leadership for improving education in this country; 
therefore, it ought to carry on a set of prospective 
activities designed to improve the substance of existing
programs and to develop new programs. The relevant
evaluation activities are summarized in Chapter 2: needs 
assessment, identification of interventions likely to 
relieve the need, small-scale testing of proposed 
programs under optimal conditions, field evaluation under 
actual operating conditions, and analysis of likely costs. 

Such a process of program planning should operate both 
at the national level and in selected states and 
localities that have the resources. A similar set of
activities is relevant to program improvement, although 
the need and the general nature of the program may 
already be established. Too often, however, the 
exigencies of the budget process and the demands from 
those concerned with implementation of current programs 
relegate the planning of new programs and the improvement 
of established ones to a low priority. The tracing of
benefits already legislated and the assurance that
programs are carried out as intended take first
priority. Development of knowledge needed to formulate
better programs is a long-term process, with no assurance
that the outcomes will be immediately applicable. In view
of pressures for greater accountability and improved 
program management, it may be argued that activities 
aimed at the substance of programs should be relegated to 
the research component of the Department, but such an
assignment may lead to unfocused research not easily
related to program variables that can become part of a 
federal education program. An interesting example of 
coordinated program improvement research exists for 
bilingual education, for which NIE, the program office, 
and the central evaluation unit all participate in 
evaluation and research planning. This kind of 
coordination recognizes that, particularly for existing 
programs, program managers should be involved in the 
design and testing of alternatives. They can provide the 
necessary experience regarding current program 
operations, and they are likely to have ideas for
improvement. But the overall effort should be in the
hands of research-trained people whose full-time
attention can be devoted to evaluation activities. 



Evaluation as a Management Tool 

In an examination of the use of social science 
information by federal executives, Caplan (1976) found 
that, in the Office of Education, more program evaluation 
was conducted and less of the information generated was 
actually used than in any other agency examined. It may 
be that, in its past emphasis on rigorous studies of 
program effectiveness, the central evaluation unit of the 
Department was not satisfying the information needs of 
the most powerful audiences, namely, the legislative and 
executive branch overseers. Their primary interest is in 
fiscal and beneficiary information, which provides an 
effective tool for holding managers at all 
levels--federal, state, and local--accountable for proper
distribution of benefits. In fact, OMB Circular A-117
(Office of Management and Budget 1979) requires both 
management and program evaluation of every agency 
(including an annual report) and ties this activity 
directly to the reward system for federal managers 
included in the recent civil service reforms. 

Problems are likely to arise, however, when 
accountability demands are taken beyond ensuring that 
resources are properly allocated. Who is to be held 
accountable for program effects that will probably be 
modest and difficult to estimate? As Cronbach et al. 
(1980) point out, condemnation of individuals for
weaknesses or "failures" that occur in a system over
which they have little control is a perversion of the use
of accountability. The delivery of federal education
programs is a case in point. Given that authority is
dispersed and delivery of educational services highly
decentralized, it is difficult to assign responsibility 
for program outcomes to specific institutions, let alone 
to sets of individuals such as teachers, superintendents, 
or federal program managers. This is not to argue that 
studies of program implementation and of program effects 
should not be done, only that they are unlikely to be a
useful management tool.

There is a second problem with using evaluations of
program effects for trying to improve program
management. The fear that programs will be curtailed
because of negative findings is aggravated in today's 
climate of tightening budgets. Even if in the past there 
have been few examples of established education programs 
that have been cut severely or abolished as a result of 
evaluation findings, the threat is real. Line managers 
and top officials wanting to build programs and budgets 
are not likely to cooperate enthusiastically in 
evaluations they perceive to have the potential of 
damaging their programs. 



How effectively is the Department now organized to carry 
out its evaluation responsibilities? Figure 3 
illustrates the organization of the Department as of 
January 1981; Figure 4 places the central evaluation 
unit, which carries major but not sole responsibility for 
evaluation, in its current context.

For evaluation activities other than fiscal accounting 
and civil rights enforcement, legislation and 
administrative actions have created a hodgepodge of 
evaluation responsibilities and assignments, based more 
on the power base and history of individual programs than 
on rational planning. After an analysis of major 
education programs, Cordray, Boruch, and Pion found: 
"Programs differ markedly with respect to the number and 
types of evaluation mechanisms that are described within 
the law and by federal regulations" (in Boruch and
Cordray 1980, Ch. 3:7). Thus, states and localities may 
or may not be charged with producing performance reports, 
doing needs assessments, and carrying out studies of 
program improvement and program effects. For some 
programs, states are supposed to monitor local programs 
or local evaluation plans or both; for others, there is
no provision for review of local evaluations. Both
Congress and the Department have been responsible for the
present mix: Congress has attached dissimilar evaluation
requirements to various categorical titles that 
distribute evaluation responsibilities differently from 
program to program; the Department (and its predecessor) 
have distributed evaluation responsibilities as much on 
the basis of the political strength of individual program 
administrators and their constituencies as on any basis 
connected with the quality or integrity of evaluations. 

There has been a central evaluation unit at the 
national level for a decade, but its responsibilities 
have varied, even as funding has increased (see Appendix 
A). After the unit was established in 1970, evaluation 
activities began to be centralized. The central unit 
acquired staff, a budget, and responsibility for national 
studies. This centralization was instrumental in 
introducing rigor, integrity, and visibility to the
evaluation efforts mandated by Congress and sponsored by
OE. For several years, budgets and responsibilities
increased. But as dissatisfaction developed with the 
perceived lack of timeliness and relevance of some of the 
studies — not to mention unhappiness with some findings 
deemed potentially damaging — pressure increased for 
certain programs to be responsible for their own 
evaluation activities. At present, some programs include 
virtually no evaluation activities other than obligatory 
program monitoring; others delegate evaluations to the 
central unit; still others conduct all their own 
evaluation activities. In addition to the central unit
and program units, evaluation activities are also carried
on in the research unit (Assistant Secretary for Research
and Improvement), the planning unit (Assistant Secretary
for Planning and Budget), and at the Secretary's level.
Until 1979, there was no overall evaluation planning or 
coordination of evaluation. 

Congressional restiveness with the performance of this
nonsystem led to still another layer, mandated
congressional studies to be carried out by a designated
unit: NIE in the case of studies on compensatory
education and on vocational education, NCES for a study
on discipline in the schools (P.L. 93-580), and the
Secretary's office in the case of a study on school
finance.



GUIDELINES FOR ORGANIZATION

It is neither necessary nor even desirable that the
organization of evaluation activities be precisely the
same for each education program. But the current
accretion of idiosyncratic evaluation legislation and
internal assignments originally made for political and
administrative reasons bears reexamination in the light
of some reasonable criteria, such as: the type of policy
question to be asked and the information needed; the most
effective and efficient ways of obtaining the needed
information; the intended use of the information
(likelihood that use will occur may depend on how and by
whom the information is generated); the size and nature
of the program; and the research capacity of the unit
considered for assignment of evaluation responsibility.
The application of such criteria will indicate what
changes might be made to improve the current organization
of federally funded evaluation activities related to
education. But since there is no one best way to
organize these activities, the implications the Committee
has drawn from the preceding discussion are presented
below as suggested alternatives rather than as
recommendations.



Centralization Versus Decentralization 



Organizational researchers and management experts have 
debated the merits of centralized organization compared 
with those of incrementalism and mutual adjustment
brought about through coordinative mechanisms among many
autonomous units. Each form of organization has its
costs as well as its benefits. Central organization can
lead to more coherent activity, but it is time-consuming
as the decision process works up through the hierarchy
and back down for execution. It may also seem capricious
and arbitrary, especially in complex situations and
situations of uncertainty. Such conditions are
characteristic of most evaluation planning related to
social programs. On the other hand, while decentralized
planning and execution can come closer to satisfying
needs of individual units at the federal, state, or local
level, it can lead to duplication, wasteful use of scarce
human and fiscal resources, and low quality. Attempts to
minimize these negative consequences through purposeful
coordination will, like other centralizing mechanisms,
exact high costs in time.

The Committee believes that the different evaluation
questions that need to be addressed concerning federal
education programs are now so diverse and of such varying
importance to different audiences that decentralization
is warranted. But responsibilities should be assigned in
a somewhat more planned manner than at present. There is
agreement within the current Office for Management, which
has overall responsibility for program evaluation, that
some evaluation activities need to be decentralized; in
fact, present law and custom so dictate. But planning
directives for 1980 manifested an attempt to recentralize
evaluation activities through review and approval by the 
central unit of all evaluation plans. No parallel 
attempt is evident with respect to evaluation activities 
funded by federal funds at the state and local levels, 
except to provide technical assistance in the case of 
Title I evaluations. 



Decentralization Among Levels of Government 

As noted, evaluation requirements levied upon local and
state agencies vary from program title to program title.
(For summary descriptions of requirements in major
titles, see Cordray, Boruch, and Pion, Ch. 3 in Boruch
and Cordray 1980.) Generally, reporting requirements
appropriately emphasize the collection of information on 
beneficiaries served and on distribution of resources. 
For a number of titles, the states carry the 
responsibility of aggregating data provided by each local 
education agency. But state-level reports have seldom 
been able to make statements about how programs operate 
throughout the state as a whole, partly because local
agencies were not reporting data of sufficient quality
and uniformity to allow aggregation. Consequently, 
states have also acquired some responsibility for 
technical assistance. For certain titles, localities are 
also required to identify the number of individuals in 
the target population (for example, for the handicapped 
covered in P.L. 94-142). Since identification of
individuals generally leads to the need to serve them, 
and federal funds by no means pay the total cost of 
service, there are considerable disincentives to 
comprehensive needs assessment carried out by local 
agencies. 

In addition to reporting on the distribution of funds 
and on the numbers and types of both potential 
participants and those actually served, some titles
require reports on "effectiveness." In most cases,
effectiveness turns out to be the degree to which the law
is being implemented, i.e., whether program services are
being provided as specified in law and regulations. A
few local and state agencies also carry out evaluations
concerned with educational effectiveness. In many cases,
however, major expenditures of their own funds reported
by local agencies as evaluation of program effectiveness
are for testing designed to track general student
achievement rather than specific effects traceable to any
one program. It appears to be the intent of current
requirements that local evaluations serve auditing and
monitoring purposes while at the same time also informing
local program developers and administrators on the best
implementation strategies. As illustrated by the history
of Title I evaluations (summarized in Chapter 4),
stipulations for local and state evaluation activities
have shown a confusion of purpose between assessing the
extent to which programs are providing benefits and
mandated services and determining ways in which local
programs might be improved. Local evaluators are forced
to use designs and methods to collect data that can be
aggregated at the state and national levels, but such
data do not serve the local needs well. Moreover, those
data have not even proved useful in providing statewide
or nationwide overviews; separate state or national
studies have been needed for that purpose. Though some
data collected at the local level might serve both local
and national purposes, each type of evaluation question
has distinctive design and measurement requirements (as 
discussed in Chapter 2) and implies different 
relationships among the three levels of government. 

We have noted in Chapter 3 the variable quality of 
evaluation activities carried out at the local and state 
levels and have recommended that Congress consider a 
diversified strategy of evaluation requirements at these 
levels (Recommendation C-3). In Chapter 4 we discussed
the need to build in the concerns of target audiences 
from the beginning to increase the likelihood that 
evaluation findings will be used. Consideration of how 
scarce evaluation resources can be best employed to yield 
reliable information that is useful to the maximum number 
of audiences reinforces the notion that division of 
evaluation responsibilities deserves more careful 
analysis than it has received. 

All grantees receiving federal funds for education 
programs have stewardship responsibilities. At a
minimum, therefore, all such grantees should continue to
be required to report on the allocation of funds, on the
numbers of beneficiaries served, and on compliance with
the law where services and processes are spelled out.
But considerably more thought should be given to the
amount of such information that can be digested at the
state and the federal levels. The impression persists
that grantee application and reporting requirements are
intended to cover all bases and collect every conceivable
bit of information, creating such an overload that most
of the data pour in without being scanned, let alone
used. For example, in the migrant education program, OE
required the states to send copies of all subgrants to
OE. According to the program auditors, this mountain of
information simply collected dust in a storage area with
no attempt made to review it (Hock 1980). The practice
was ended as a result of the program audit. More
carefully considered requirements would reduce costs and
response burden and provide fewer and briefer reports
more likely to be reviewed.

Requirements that go beyond the basic reporting needed 
for accountability functions should not be levied on all 
localities and states alike. Questions on how a program 
actually operates in the school, questions on the
detailed nature of the services and variations in
different localities, and--most difficult of
all--questions on the educational effects traceable to a
specific program need not be answered by all localities 
or grantees. Cost effectiveness questions dealing with 
the desirability of different program alternatives are 
probably an even less appropriate requirement at the 
local and state levels. Scarce evaluation resources are 
frittered away when demands are made of all that could be 
responded to more effectively by selective sampling in 
nationwide studies or by studies carried out by 
individual local systems or states with proven competency 
and sufficient fiscal and human resources to evaluate 
their own programs. These considerations lend additional 
force to the recommendation made earlier: 

Recommendation C-3. Congress should institute a 
diversified strategy of evaluation at the state and local
levels that would levy minimum monitoring and compliance
requirements on all agencies receiving federal funds, but
allow only the most competent to carry out complex
evaluation tasks.

To this, we add a recommendation regarding the
Department's responsibility.

Recommendation D-16. The Department of Education should
clearly spell out minimum requirements for monitoring and
compliance reporting and set standards for meeting the
requirements.

The objective of this recommendation is to improve the
quality of data needed for accountability without
increasing the burden of response on local and state
agencies. Such data items as distribution of funds,
number and types of beneficiaries being served, and
specific program services should be defined by the
Department so that local and state agencies know exactly
what reporting is required of them. Quality control
procedures should be enforced so that performance reports
can be made to Congress. Before setting the
requirements, however, the Department needs to examine
its own capacity to deal with local and state reports so 
as to avoid collecting information that is never used 
because of the sheer inability of federal staff to deal 
with the volume. 

In order to assist agencies in complying with federal 
reporting requirements, the Department should extend 
technical assistance as recommended above (Recommendation 
D-8). One way to provide such assistance would be to
select local and state agencies doing an exemplary job of
reporting. If none exists, the Department should fund
the development of such examples. Care must be taken to 
select different types of locales exhibiting a variety of 
student, teacher, and resource mixes. The exemplary 
procedures should then be actively disseminated through 
existing channels, for example, the Department's regional 
offices, the Title I TACs, the NDN, or the state agencies. 

A second way to provide technical assistance would be 
to make funds available to selected exemplary local 
agencies to provide technical assistance on meeting 
reporting requirements to less skilled school systems of 
comparable type — something like the 
"developer/demonstrators" funded by the NDN (Far West
Laboratory for Educational Research and Development 1979)
to provide training, materials, and technical assistance
for adopting exemplary education programs. After the
first 2 or 3 years, such funding should be based on the
success of an agency designated to provide technical
assistance in improving the reporting of those receiving
the assistance.



Decentralization Within the Department

Different evaluation activities are appropriately located
in different units of the Department to take advantage of
incentives for using results and of staff interests and
competencies. Using the typology developed in Chapter 2,
we suggest general guidelines for locating evaluation
activities within the Department.

The Office of the Inspector General should continue to
monitor whether funds are distributed according to law
and are allocated for the prescribed purposes. When
questions arise as to whether such additional services as
the law mandates are being provided to the target
population(s) (rather than the funds being used for
regular school operations), they need to be investigated
through evaluation strategies and methods appropriate to
documenting the nature of program interventions. This
type of evaluation requires research capabilities beyond
the scope of the Office of the Inspector General.

Accountability questions on beneficiaries served and
on program delivery should be monitored by officials who
administer the programs at the federal level, namely the
Assistant Secretaries for Elementary and Secondary
Education, for Special Education and Rehabilitation
Services, for Post-Secondary Education, and for
Vocational and Adult Education and the Director of
Bilingual Education and Minority Languages Affairs.
Responsibilities should include the monitoring of program
coverage and of provision of services mandated by law and
regulation (including such associated requirements as the
setting up of parent advisory councils). Where civil
rights laws are involved, the Assistant Secretary for
Civil Rights has and should continue to have
responsibility. Much of the information on program
coverage and delivery should be obtainable through 
focused grantee reporting using adequate quality control 
and technical assistance measures, as discussed above. 

There is continuing need for a central evaluation unit 
to carry out activities not directly linked to program 
accountability. First, the unit should sponsor, on a 
sample basis and in cooperation with the program unit, 
documentation of program process and detailed 
implementation so as to provide insight on how
educational services have been changed. Second, also in
cooperation with the cognizant program units, the central
unit should support program improvement or development
studies, including needs assessment and understanding of
program context, the testing of promising alternative
program strategies, and analyses of the effects of
proposed changes in law or regulation. Third, when the
issue is educational effectiveness, the unit should carry
out--in cooperation with the program offices--needed
evaluability studies to define objectives and appropriate
measures. Only if such measures can be successfully
established and only if a program is of the type and at a
stage to allow impact evaluation (see Chapter 2) should
such a study be undertaken and then only if the need for
it can be justified.

The reason for assigning shared responsibility for
these activities is that program administrators
presumably have in-depth knowledge of their programs and
an interest in improving educational substance, but they
may also have a vested interest in current operations.
At the same time, the central unit is likely to have less
program expertise but a greater concentration of
evaluation talent and social science expertise. When
such talent and expertise can be found to an adequate
extent in a program office, it may take the lead, with
the central evaluation unit as the cooperating office.
The central unit should also, from time to time, run
checks on accountability information developed by program
offices and the Inspector General and, when necessary,
conduct its own studies. Precisely how all these
evaluation responsibilities are shared between the
central unit and program offices ought to be a function
of the expertise residing in each program office.

Three functions are appropriately shared between the
central unit and NIE (which is under the Assistant
Secretary for Educational Research and Improvement). The
first is cost-benefit studies designed to establish the
efficiency of alternative ways of obtaining the
objectives of a given program. Such studies require all
the expertise needed for assessing program effects and
tying them to specific components of the program
treatment. In addition, benefits and costs of the
program must be put in monetary terms, a difficult
conceptual problem. Cooperation with NIE is suggested
because of the breadth of skills required and because it
may be necessary to conduct basic research in how to do
cost-benefit studies in education. Each particular
instance of doing such a study will provide material for
theoretical research and should be fully informed by it.
The two units should also jointly administer the types of
grant programs suggested in Chapter 3 for local and state
[...] cooperate in the evaluation research program recommended
in Chapter 3 for developing new methodology and for
investigating evaluation processes (see Recommendation [...]).

Evaluation activities not directly related to a
particular federal program, especially those concerned
with developing knowledge on more effective educational
interventions, should be supported or carried out by the
research arm of the Department, that is, NIE and other
units within the office of the Assistant Secretary for
Educational Research and Improvement.



Coordination 

Decentralization creates the problem of effective use of
evaluation dollars that are dispersed among three levels
of government and among many units of the Department of
Education. A first but not sufficient requirement to
address this problem is adequate reporting. The lack of
information on the amount of evaluation dollars spent at
the state and local levels has already been discussed,
but even accounting for evaluation dollars within the
Department becomes a matter of definition, depending on a
particular unit's need or desire to display or hide its
evaluation activities. In Chapter 3 we recommended
that Congress segregate evaluation funding at the state
and local levels from program funds and administrative
costs and require an annual accounting; we repeat those
recommendations here.

Recommendation C-2. Congress should separate funding for
evaluations conducted at the state and local levels from
program and administrative funds.

Recommendation C-4. Congress should require an annual
report from the Department of Education on all evaluation
activities and expenditures, including those at the state
and local levels.

The central unit should be responsible for preparing
the annual expenditure report and an overview of the
substance of all evaluation activities paid for by
federal education funds, as it does now for its own
activities.

Beyond reporting, however, the central unit should be
responsible for coordination of evaluation throughout the
Department. Coordination is critical because of the
interorganizational complexities discussed in Chapter 3.
Many different parties within the Department have a stake
in evaluation, most especially the operating program
units and the planning component, which is currently
separated from the central evaluation unit. (See the
discussion below on the placement of the central
evaluation unit.) Coordination also should contribute to
more efficient use of evaluation resources. For the four
phases of evaluation (planning, design of specific
studies and procurement mechanisms, review, and use of
findings) there are several ways in which authority and
control could be distributed, i.e., in which evaluation
activities could be coordinated:

1. The head of the central evaluation unit or
cognizant assistant secretary could have both the
responsibility and the authority (that is, final sign-off
power) for approving plans, design and procurement,
findings, and their dissemination. Insofar as possible,
this person (office) could also set up incentives for
application of findings or sanctions against nonuse.

2. The central unit could have major responsibility
for coordination of planning, for review of designs and
quality of procurement (but no sign-off power), and for
review of findings together with the initiating unit,
with dissemination also shared with that program unit.

3. Besides carrying out its own projects, the central
unit could provide technical assistance (when asked) to
other units engaged in evaluation activities, but have no
further authority or responsibility. In this case,
coordination responsibility or authority would either be
assigned to some other level (say, the Secretary's or
Undersecretary's office) or not assigned at all, as was
the case for the Education Division within HEW until
recently. (While HEW's Assistant Secretary for Planning
and Evaluation received evaluation plans from the whole
Education Division, generally only those from the central
unit were reviewed; see Appendix A.)

The Committee believes that, for each phase of
evaluation, a different degree of sharing of
responsibility and authority is appropriate.
Relationships should also vary depending on the nature of
the evaluation activity and the degree of expertise
residing in offices other than the central unit. We make
some suggestions below as to coordinating mechanisms that
strike a balance between totally centralized decision
making (option 1 above) and autonomy for each unit
(option 3 above). But we recognize that any (or no)
coordination comes at a cost. The costs of no
coordination at all include not only the wasteful use of
evaluation dollars, but also the failure to use
evaluation findings and the inability to cumulate
knowledge about programs. The cost of any degree of
coordination is time: more staff time for communication
and more executive time for making decisions. Therefore,
no matter what coordinative mechanisms are adopted, the
Committee suggests that both the time invested and the
results be tracked with some care, so that the effort to
use evaluation resources wisely does not end up leading
to negative results. For example, staff may get so
occupied with meetings, with defenses against criticisms,
and with waiting for decisions that they have inadequate
time to produce procurement requests of high quality, to
effectively monitor evaluation studies, to respond to
modification requests from contractors or grantees, to
review reports in detail, or to disseminate findings.
Tracking of how well coordination procedures work should
lead to their reexamination periodically, perhaps every 3
or 5 years. The rest of this section presents our
suggestions for the Department with regard to
coordination at each stage of the evaluation process.



Planning 

We believe planning should be centralized, with all
units (program, policy and planning, budget, research,
etc.) involved at the staff level and with sign-offs
required by each assistant secretary. The assistant
secretary responsible for evaluation should take the lead
for the coordination of planning. The central unit
should carry responsibility for developing, together with
the cognizant program units, a coordinated plan,
including series of related studies, for each of the
large federal education programs, as exemplified by those
for Title I and P.L. 94-142. The central unit also
should be charged with the coordination of all evaluation
planning, even though the planning and execution of
specific studies may be carried out elsewhere: a program
office, the research unit, or even the local or state
level.

We note the current attempt by the central evaluation
unit to coordinate plans for fiscal 1981 and fiscal 1982
(see U.S. Department of Education 1980b, 1980c). We
suggest coordination of planning not because we believe
that control of all evaluation activities should be
lodged in the central evaluation unit (we do not) but
because there appears to be no overall evaluation
planning with [...] priorities for the
Department. Until the Department develops such plans, it
will be subject to ad hoc, arbitrary changes in
direction. Such changes prevent the cumulation of
incremental program information of the kind needed by
decision makers both in Congress and within the
Department. Improved evaluation planning will clarify
data and information needs for evaluation and allow the
Secretary to assign priorities to them in the context of
other data gathering needs. Recommendation D-10, which
speaks to this issue, is repeated here:

Recommendation D-10. The Department of Education should
institute a flexible planning system for evaluations of
federal education programs.

In Chapter 3 we emphasize that planning for evaluation
cannot be a totally internal activity. Outside groups
having a stake in a program must be consulted. Since the
Department's top-priority external audience is Congress,
the Department needs to develop better liaison regarding
evaluation activities with members and with congressional
staff. Congressional aides have been very critical about
the relevance, timeliness, and packaging of evaluation
reports (see Zweig 1979). More involvement of
congressional staff is needed in selecting basic issues
and questions that can be answered by the evaluation
process. The central evaluation unit, being more removed
than program administrators from the politics surrounding
particular education programs, should be charged with the
responsibility of communicating with Congress about
evaluation needs (see Recommendation C-1). Program
units, on the other hand, tend to be closer to such
constituency groups as representatives of target
populations and educators charged with carrying out the
programs; therefore, they should be responsible for
obtaining their participation in the planning for
individual studies as well as in the development of the
overall plan.



Design of Studies and Procurement

Technical committees drawn from the staff of the central 
evaluation unit and from the Office of Educational 
Research and Improvement, supplemented by staff from the 
originating unit (if other than either of these two) 
should review and comment on all design and procurement 
documents. Final veto or sign-off power, however, should 
not reside with these committees but with the cognizant 
assistant secretary supervising the unit that prepared 
the design or the procurement instruments or grant 
guidelines. If technical or substantive criticisms are
made by the reviewing committee, the cognizant assistant
secretary should require responses from the originating
unit that either refute the criticisms or indicate
changes made as a result. If the central unit is the 
sponsor of the study, the process should be reversed, 
with the relevant program unit providing review. The 
central unit should also have staff available to provide 
technical assistance during the execution of a study, 
that is, when staff from other units monitoring an 
evaluation contract might call for assistance in 
reviewing progress or authorizing changes in study 
direction, design, test instruments, analytical 
strategies, and the like. 



Review of Findings 

The process for review of findings, either at an interim 
stage or in final reports, should be similar to that 
suggested for the design and procurement of studies. 
Technical committees drawn from the staff of the central 
evaluation unit and the Assistant Secretary for 
Educational Research and Improvement (possibly the same 
ones involved in the design and procurement phase) should 
review reports and associated materials. Comments should 
be forwarded to the originating unit, with a requirement 
for rebuttal or incorporation of changes responsive to 
the technical review. Program units should be afforded 
the same review opportunity for studies originating in 
the central unit. These internal reviews of designs and 
of findings should be preliminary to the external reviews 
suggested for each of these phases in Chapter 3.



Dissemination and Use

As recommended in Chapter 4, the originating unit should
have the responsibility of building a dissemination and
use plan into its original procurement document and of
ensuring that such a plan is part of the accepted
proposal and subsequent contract or grant. The
originating unit's dissemination plan would be reviewed
along with other features in the design and procurement
phase. The originating unit should have the
responsibility for carrying out the dissemination plan
addressed to the primary audiences, who presumably are
closely tied to the originating unit. The central unit
may carry out dissemination to secondary audiences as it
deems appropriate.

The central unit should also serve as the storehouse 
and coordinating center for information derived from all 
evaluation activities, including not only studies
originating in the Department, but also those carried out
by state and local agencies and even work relevant to 
education that may not have been federally funded or be 
concerned with federal education programs. The unit 
should be responsible for cumulating knowledge from these 
sources, reanalyzing data, and refocusing information 
necessary to suggest changes in legislation, in 
regulation, in program management, or in program 
intervention as evidence indicates. Other units, 
particularly the Department's research arm, should 
cooperate in this integrative function. 

Functioning as something like a nerve center for 
evaluation information, the central unit should also be 
charged with getting relevant information to audiences 
that can act on it or are likely to have an interest in 
it, beyond the audiences already included in the 
dissemination plans for a specific study, as noted in the 
following recommendations from Chapter 4: 

Recommendation D-13. The Department of Education should
ensure that dissemination of evaluation results achieves
adequate coverage.

Recommendation D-14. The Department of Education should
observe the rights of any parties at interest and the
public in general to information generated about public
programs.

Recommendation D-15. The Department of Education should
give attention to the identification of "right-to-know"
user audiences and develop strategies to attend to their
information needs.

To carry out these functions, the central evaluation
unit should have a dissemination arm. Such a subunit
should also devote time and energy to the communications
problem. Too many evaluation reports are cloaked in
jargon that is unintelligible to decision makers and
other nontechnical audiences. Although most evaluation
contracts now specify that an executive summary must
accompany the final report, insufficient attention to
effective packaging of evaluation findings continues to
be the rule. Too many reports are not read or not
understood by busy policy makers or by outside groups
that could use the information because the language of
the reports is unclear. There is a real difference
between ambiguity of findings, which can be expected for
large, complex programs that encourage local variability,
and the inability to present those findings in
understandable prose. Personnel in the central unit
charged with responsibilities for disseminating
evaluation findings must perform the translation from
scientific jargon to clear English when such translation
has not been adequately done by contractors or grantees.
In order to be effective in this role, however, central
unit dissemination staff must possess requisite
communication skills and must be insulated from political
pressures that otherwise will quickly undermine the
credibility of their work.



Location of the Central Evaluation Unit 

We have proposed that the central evaluation unit be
charged with important coordinating responsibilities in
developing the Department's overall evaluation plan and
in synthesizing and disseminating evaluation-related
knowledge derived from all sources. We do not foresee
that these responsibilities can be adequately carried out
as long as the central evaluation unit is subsumed within
the management arm of the Department. The implicit
message of this arrangement is that only the management
perspective of evaluation is considered a high priority.

While some members of the Committee favor the
assignment of an assistant secretary to the evaluation
function and other members disagree with this particular
approach, all members agree that evaluation is currently
too far removed from the top policy circles in the
Department. This distance makes it unlikely that the
central unit would be able to effectively coordinate
evaluation activities across the Department. Yet this
unit is probably the only one that could provide the
Secretary with a comprehensive view of the amount of
money being spent for evaluation, of the types of
evaluations under way, of the effectiveness of the
various disparate parts of the evaluation "system," and
of the potential for using study findings to make more
informed decisions about programs.

A variety of administrative mechanisms can be used to
improve the current situation. For example, the
Department could make the unit a separate office
immediately responsible to the Secretary or the
Undersecretary to provide the needed access and
credibility. A precedent exists in the case of the
Office of Bilingual Education and Minority Languages
Affairs. Another possibility for making the unit more
effective is to couple it more closely to the major
planning function. We would caution, however, that some
separation should be maintained between evaluation and
budgeting. Though these functions are often located
together, subservience of evaluation to the budgetary
process is as counterproductive as using evaluation to
chastise or reward individual program managers,
apparently the Department's current direction. If
budgetary decisions and the handing out of rewards or
sanctions are to be the main functions of evaluation
activities, they will be devalued as a means for program
improvement. As long as evaluation is seen as a
threatening rather than as a supportive activity, those
who are subject to the threat will find ways of defusing
it by covert lack of cooperation or outright opposition.
As a result, evaluation activities will continue to be
curtailed, and results consigned to the dusty shelves of
unused reports.

CONSTRAINTS 

No matter how evaluation responsibilities are assigned 
and organized, the Department has to face some important 
constraints that are only partly under its control: 
constraints of budget, of staff, and of process.

Budget Constraints



Pressures to reduce the federal budget have taken their
toll of evaluation projects since many such projects are
discretionary items. In fiscal 1980, the central
evaluation unit was not able to initiate any new studies
except those expressly mandated in law or made possible
through specific set-asides for evaluation (for example,
the half-percent of program funds mandated for national
evaluation of Title I). However, as a consequence of the
dispersion of evaluation responsibilities, the central
unit spends less than half the money invested in
evaluation at the national level: $19.6 million of the
$43.4 million estimated for the whole Department
(including the inspector general) in 1980. (For an
estimate of evaluation spending by various components of
the Department, see Appendix A.) As already noted,
additional federal funds are spent at the state and local
levels for evaluations. With respect to accountability
of spending for evaluation, then, there is trifurcation
of responsibilities: the central evaluation unit,
program units of the Department, and states and
localities. But only the central unit has been the
object of major scrutiny and a decreasing budget, while
responsibilities and funds are idiosyncratically assigned
by legislation or executive practice to selected federal
program offices and to state and local authorities, often
without similar scrutiny of performance.

In the last 3 years, the Department has not been 
successful in convincing the appropriations committees of 
Congress that an increased budget for the central 
evaluation unit was warranted, even while authorization 
committees have asked for more evaluation. In fact, 
funds have been appropriated for evaluation activities 
outside the central unit, and Congress has spent 
additional funds on its specially commissioned studies.
These actions appear to reflect an inability to make a 
convincing case for the work of the central unit, 
although it is not clear whether the apparent 
dissatisfaction leading to decreasing budgets has been 
warranted by inadequate performance or has been due to 
greater visibility and scrutiny. 



Staff Constraints 

We have commented previously that the complexity of any
evaluation process beyond tracing money and counting
people calls for particular technical skills and social
science knowledge. Staff members responsible for
evaluation programs should be well grounded in the theory
and technical knowledge of a variety of social and
behavioral science disciplines. They must also be in
touch with the perspectives represented by various
interest groups who represent program beneficiaries and
service providers. Having practical program knowledge
and experience is helpful as well, though this can be
supplied through cooperation of the relevant program
units.

The staffs of evaluation offices have to be able to
explain issues involved, to develop questions to be
answered, to suggest methodologies for research, and to
prepare statements of work for RFPs and other procurement
documents. They have to participate in panels that
establish criteria and make recommendations for the
selection of winning contractors. They are also likely
to negotiate substantive contract issues before awards
are made. After a contract is awarded, the cognizant
staff person or project monitor must be able to provide
technical assistance if needed by the contractor, assist
in clearing survey instruments, and rule on modifications
requested by the contractor. In order to respond
effectively to contractor requests, the staff person
needs to understand through first-hand research
experience whether requested changes are appropriate or
not. Throughout the course of a project, staff members
must provide professional review, including careful
examination of final reports.

The unusual array of skills, experience, and diverse
perspectives needed to manage evaluation programs is not
easily obtainable. The Department is limited in its
ability to recruit top-quality staff in adequate numbers
because of personnel ceilings and other civil service
constraints. The Committee has not had time or
opportunity to assess the qualifications of the staff in
the central evaluation unit, though there are obvious
gaps in disciplinary expertise, in the representation of
minorities (see Chapter 3), and in hands-on experience
with field-based applied research studies of the kind
being designed and monitored by the unit. What seems
clear, however, is that the current deployment of staff
and assignment of responsibilities does not take
advantage of the collective expertise in the central unit
and in the research components located elsewhere in the
Department (for example, in NIE or the National Institute
of Handicapped Research). External requirements and
internal practice with respect to planning, procurement,
and clearance have severely constrained the time needed
to do quality work; the combined effect of
conceptualization of large-scale studies by single
individuals or small groups (as has been the practice in
the central unit) and the need for early closure on
technical detail is to leave little room for creativity.
Nor is it likely that the expertise represented by the
central unit is duplicated in every program office with
evaluation responsibilities. In some cases, evaluation
work carried out elsewhere in the Department may open up
innovative ways of planning and designing studies, as has
been true for the NIE compensatory education study and
the evaluation plan for P.L. 94-142. Both these
instances come from units with research expertise. Other
program offices, however, are unlikely to be able to
staff up for the evaluation responsibilities now assigned
them or that they might acquire in the future.

Recommendation D-17. The Department of Education should
examine staff deployment and should establish training
opportunities for federal staff responsible for
evaluation activities or for implementation of evaluation
findings.

The Department should consider alternative ways of
using the technical staff within the central unit and
evaluation staff in other units. Duties and
responsibilities would vary according to the amount of
government control exercised by staff; grants and
consultancies entail the least control, contracts and
evaluation teams configured of government staff and
outside experts more, and in-house studies the most.
Figure 5, adapted from one originally prepared by Wargo
(1980), illustrates the three major relationships between
government staff and outside experts and some of the
characteristics of each alternative. The Department has
largely used the contracting method, though in-house
analysis has been characteristic of selected areas,
particularly for postsecondary programs. There may be
evaluation work that is better addressed by the
grant/consultantship method (see Chapter 3) or by an
evaluation team. In part, the choice depends on the type
of evaluation work to be undertaken, but staff capability
is an equally important criterion. The greater the
degree of government involvement, the greater the skills
and the greater the number of personnel that are
required. The grant/consultancy method allows maximum
contribution from the field; the evaluation team concept,
though it requires skilled staff, still allows
participation by outside experts while making possible
quick response (see Recommendation D-11).

For any given staff role in evaluation work, there
must be an adequate number of staff, and they must have
the requisite training and experience. Moreover, a work
atmosphere conducive to attracting good staff and holding
them must be provided. The Department should examine the
number and types of positions assigned to evaluation
activities in light of responsibilities and work load
(number of RFPs to be prepared, contracts monitored,
final reports to be analyzed, etc.) within the central
evaluation unit and wherever else evaluation activities
are carried out. It should also examine the extraneous
and counterproductive demands that are imposed on staff
through internal procedures that could be simplified.
Consideration of personnel needs should also take into
account the time required for the type of training
suggested below.

The academic and experience background of personnel
charged with evaluation responsibilities should be
examined in connection with the tasks they are required
to perform. This applies to staff in program units as
well as to staff in the central evaluation unit. If
necessary, training programs should be conducted to
prepare staff members for the writing of work statements,
to familiarize them with new evaluation techniques, and
to strengthen their knowledge of selected social science
disciplines. Handbooks should be prepared for persons
who monitor the substantive aspects of evaluation
contracts. If federal personnel lack field experience, a
determined effort should be made to expose them to
practical situations affecting the evaluation process.
Short-term field assignments could be used to provide
national office personnel with needed practical
experience.

At the same time, as noted in Recommendation 
program eicecutives and staff as well as other line 
executives outside the units specifically concerned with 
evaluation would benefit from greater knowledge of the 
language of evaluation and how evaluations can be used. 
Program man^agers at the federal level play a variety .of 
important roles in the* evaluation of education programs. 
Program managers often suggest which of the national 
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prr>9ram8 within their purview ought to be evaluated. 
Such decisions reflect a concern for important issues in 
-prpgram delivery and program effects that must be 
translated. into the evaluation questions to be asked. 
Program managers need to provide key questions to the 
evaluation /experts, spell out what they consider to be 
indicators of successful performance, and so on. During 
^^^""^^rae of a study, managers often assume the role of 
co-monitor and may accompany the technical evaluation 
team into the field to assess progress. At the end of an 
evaluation, managers play an important role in the 
interpretation of the results. All of these roles would* 
be significantly improved if managers had a better 
understanding of the basic principles of evaluation. 
Training for federal staff on relevant topics should be 
instituted. Seminars in evaluation methodology and in 
applications of social science research to program 
improvement could be given by technical staff from the 
central evaluation unit and the Department's research arm 
and by external evaluation experts. A newly created 
training unit within the Department, the Horace Mann 
Institute, provides an appropriate internal vehicle. 
Other alternatives include specially tailored offerings 
by the Federal Executive Institute and the Graduate 
School of the Department of Agriculture (which is 
scheduled for transfer to the Department of Education). 
In addition to providing some technical knowledge, 
training should increase the understanding of program 
managers about what kind of information evaluation can 
and cannot provide. 



Process Constraints 

In a number of ways, the Department's own procedures 
inhibit its ability to produce timely and relevant 
evaluation studies of high quality. These procedures 
affect each stage of the process: producing a coherent 
set of plans for the whole Department, designing 
individual studies, procurement, launching the study once 
a contract or grant has been awarded, monitoring its 
progress, and disseminating its findings. A typical time 
chart for a relatively straightforward study that is 
intended to take 12 working months for design, data 
collection, and analysis is pictured in Figure 6: under 
current conditions, a lead time of 3 years is necessary. 



Planning and Design 

In recent years, the Department and its predecessors have 
tried to introduce planning mechanisms that would help 
set priorities and achieve greater coordination (see 
Appendix A). One unfortunate consequence has been to 
delay approval of studies, as illustrated by the 1980 
procurement schedule (see Chapter 3, Table 1). Delays in 
the planning process may create postponement of studies 
to the next fiscal year. An even more adverse effect 
(also noted in Chapter 3) has been the unwarranted 
compression of time for the most difficult intellectual 
work: design of a study by federal staff and by 
responding proposers. The planning process is under the 
control of the Department; presumably, as planning 
mechanisms become better established, time delays can be 
reduced. 



Procurement 

The procurement process or any alternative mechanism for 
getting the work done entails negotiations within the 
Department between the unit designing the evaluation and 
the relevant program unit (if the study is not conducted 
there) as well as other parties at interest, for example, 
the Office of Civil Rights, the offices of the 
Undersecretary or the Secretary, the Assistant Secretary 
for Planning and Budget, or the National Institute of 
Education. In selected cases (for example, in Title I 
evaluations in which the legally constituted advisory 
council participates) outside groups are also involved. 
(We note that our recommendations in Chapter 3 with 
respect to opening up the procurement process in order to 
enhance the quality of evaluations will further 
complicate the process and may introduce additional time 
lags.) A major party to such negotiations is the Grant 
and Procurement Management Division, which must approve 
all procurement instruments or grant announcements. The 
federal competitive procurement processes as interpreted 
and enforced by this division take, on the average, 6 
months from review of the statement of work prepared by 
the initiating office to the time of award, exclusive of 
response time allowed between announcement of RFP or 
grant guidelines and the proposal due date. 
Noncompetitive processes, such as sole-source awards or 
unsolicited proposals, can be completed in shorter time, 
but they are seldom employed because they are more easily 
subject to the charge of favoritism. 

The objectives of competitive procurement are to get 
the best buy for the evaluation dollar and to assure a 
fair process. As the competitive procurement 
mechanism now operates, neither objective is likely to be 
attained: only a few performers are able to compete, and 
the costs of evaluations are increased by the 
considerable, though hidden, costs of the process 
(preparing RFPs, writing lengthy proposals) that are 
built into internal staff salaries and the total costs of 
the resulting contracts. At the same time, the losses 
that result from the process are considerable: 
limitations on creativity and quality, time delays, and 
wasteful use of human resources inside and outside 
government. Though the way the government obtains 
research services is generally regulated by statutes that 
pose external constraints, any federal agency has 
considerable latitude in its interpretation of applicable 
regulations. Differences in operating procedures are 
readily discernible to individuals familiar with several 
agencies. The Department of Education would profit by 
examining the more flexible strategies of other agencies. 



Launching a Study 

For any study that involves collecting the same 
information from nine or more respondents, OMB clearance 
(which may be delegated) must be obtained. When this 
requirement was first instituted by OMB, there were three 
reasons: to assure adherence to statistical standards, 
to allow OMB to judge the economic impact of a proposed 
study, and, most importantly in recent years, to reduce 
the burden on respondents imposed by the multiplying 
demands for data. Reduction of the response burden 
remains a major objective for both the administration and 
Congress. As more and more data collection efforts in 
education became subject to clearance (e.g., program 
report forms, statistics gathered by NCES, all evaluation 
and research studies resulting in information to be 
delivered to the government), the Education Division 
within HEW set up its own internal screening mechanism, 
the Educational Data Acquisition Council (EDAC), to 
facilitate OMB clearance. In parallel, the chief state 
school officers, concerned with the time and money 
consumed by responding to federal data requests, also 
obtained the right to clear study designs and instruments 
through their Committee on Evaluation and Information Systems 
(CEIS). The 1978 education amendments (P.L. 95-561) 
created the Federal Educational Data Acquisition Council 
(FEDAC, the successor to EDAC) as the designated body to 
replace OMB in controlling demand for data in education, 
with CEIS as an official participant. As noted in 
Chapter 3, the 1978 amendments also introduced the 
requirement for notification and availability by February 
15 of data collection instruments to be used in the 
following school year. The effects of the clearance 
provisions are illustrated by the following examples. 

A contract for a study on sex equity in vocational 
education, mandated by Congress, was awarded in late July 
1977. By early December, with concentrated efforts by 
the contractor and the federal project officer, the forms 
clearance package was sent to the OE clearance officer 
who had the job of reviewing submissions to EDAC. The 
clearance officer sent the package forward 2 months 
later, in early February 1978. EDAC clearance was 
obtained on March 1, and the package was then forwarded 
to the Assistant Secretary of Education whose clearance 
was needed before submission to OMB. This clearance was 
obtained on March 22, and OMB clearance, the final 
hurdle, was received on April 14. Because the study had high 
visibility and because there were relatively few 
instruments involved, clearance took 4-1/2 months, close 
to the minimum time averaged during that period. There 
were, however, important changes in instrumentation: a 
major questionnaire dealing with attitudes was eliminated 
at the stage of OMB clearance (as were most such items in 
other types of instruments). The ostensible reason for 
the deletion was that the legislation did not require 
collection of that type of information. In this way, a 
review of 3 weeks overrode the work of 4 months — which 
included extensive consultation with parties at 
interest — by the contractor and the project monitor. 
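
The elapsed times in this example can be checked with a short calculation. The sketch below is illustrative only: the last three dates are those given above, while the dates standing in for "late July," "early December," and "early February" are assumed placeholders, and the names `MILESTONES` and `stage_durations` are hypothetical.

```python
from datetime import date

# Milestones of the sex equity study clearance. Dates marked "assumed"
# are placeholders for "late July", "early December", "early February".
MILESTONES = [
    ("contract awarded", date(1977, 7, 25)),                       # assumed
    ("package sent to OE clearance officer", date(1977, 12, 5)),   # assumed
    ("package forwarded to EDAC", date(1978, 2, 6)),               # assumed
    ("EDAC clearance", date(1978, 3, 1)),
    ("Assistant Secretary clearance", date(1978, 3, 22)),
    ("OMB clearance", date(1978, 4, 14)),
]


def stage_durations(milestones):
    """Return (stage label, days elapsed) for each consecutive pair."""
    return [
        (f"{a_name} -> {b_name}", (b_date - a_date).days)
        for (a_name, a_date), (b_name, b_date) in zip(milestones, milestones[1:])
    ]


durations = stage_durations(MILESTONES)
for label, days in durations:
    print(f"{label}: {days} days")

# Clearance proper runs from submission of the package to OMB approval.
clearance_days = (MILESTONES[-1][1] - MILESTONES[1][1]).days
print(f"clearance total: {clearance_days} days "
      f"(about {clearance_days / 30.4:.1f} months)")
```

With these placeholder dates, the final OMB stage comes to roughly 3 weeks and the whole clearance to a little over 4 months, in line with the figures cited above.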

Another example concerns a planned study of Indian 
education scheduled for completion in order to feed into 
the reauthorization process for the legislation, due to 
expire in 1983; hence, the study results should be 
available for hearings likely to be held in 1982. 
Approval for the study was not received from within the 
Department until May 1980; an award was made on September 
30, 1980. Even more than for the sex equity study, the 
choice and design of instrumentation will have to include 
careful consideration of the sometimes conflicting 
sensitivities and points of view of the populations 
being served, the service providers, and the framers of 
the program, both legislative and executive. But the 
previously noted requirement for February 15 notification 
and availability of instruments means that there were 
then only 4 months available to flesh out the design of 
the study, including methods, measures, and specifics of 
data collection, and for getting the whole package 
approved through the clearance mechanisms. If the 
February 15 deadline cannot be met, either a waiver will 
have to be obtained or the study postponed for a whole 
year. Not only will postponement add considerably to its 
cost, but it will make the study irrelevant to the 
purpose for which it is being undertaken, since data 
collection could not even begin before fall of the year 
(1982) in which the congressional hearings are to be held. 

In another case, a recent 12-month study of OE 
evaluation projects, clearance procedures had not been 
completed by the time the study was done and the contract 
had ended. The choice was to delete the data collection 
aspect of the study or to proceed in the absence of 
required clearance. The first would have led to a year 
or more delay in the study, the second to illegal 
procedures. 

Carter (1977) describes two other examples. For the 
sustaining effects study of Title I, a very complex study 
using 10 different types of measures, clearance of the 
first 2 of the 10 sets of measures took 8 months. The 
clearance packages for all 10 sets of instruments 
totalled 1,412 pages; Carter's estimate of the cost for 
the clearance process (not including development of the 
instruments) was $155,500 (in 1976 dollars). The second 
example involved a congressionally mandated study of 
Title I services for neglected or delinquent children; 
clearance took 6 months. Carter notes (1977): 

Almost without exception, reviewers, either at 
OE or OMB, had never been to an institution for the 
neglected or delinquent. Many of them were not 
aware of the results of our clinical pretests, yet 
they felt they knew how and in what form the 
material should be collected. Again, 
office-generated expertise superseded actual field 
experience. 

The last example that we cite provides an interesting 
illustration of how the drive toward reducing 
respondents' burden has created a lack of information in 
an area that would appear to be directly relevant to the 
federal role in education. Larson (1980) recently 
studied the collection of race, ethnic, and gender data 
on participants in federal education programs. 
Collection of such data is rare except in those cases 
where specific populations are targeted, for example in 
ESAA (desegregation assistance) and bilingual programs, 
for which information on the targeted group is 
collected. Other exceptions are research studies not 
directly coupled to specific program evaluations, such as 
the National Assessment of Educational Progress (1978). 
Yet given the overall mission of the federal education 
programs to increase equal educational opportunity, it is 
somewhat surprising that programs as a whole are not 
evaluated with respect to their effectiveness in 
improving education for ethnic or racial minorities and 
females. Recently, regulations have been changed to make 
possible the gathering of data on race, ethnicity, and 
gender in grantee applications for funds, but the 
gathering of such data for program assessment has always 
been possible. That it is still largely absent can in 
good part be ascribed to budgetary and clearance 
constraints, which drive any evaluation study toward 
collecting only those data for which there is an express 
"need-to-know." And "need-to-know" is often equated with 
specific mention of a subgroup in legislation for the 
program or its evaluation. One can only conclude that 
current clearance procedures, whatever other purpose they 
may serve, have had the effect of minimizing the ability 
to obtain information crucial to meeting federal goals in 
education. In part, that effect may have been the result 
of considering each study in isolation as it went through 
the clearance process and attempting to minimize response 
burden case by case. We note that the process is in the 
midst of change. 

At this time, the intent (expressed both through 
executive action by OMB and through proposed legislation 
in Congress) is to manage the reduction of response 
burden more like fiscal budget allocations: each agency 
submits to OMB an information collection budget that 
requests an allocation of the total number of burden 
hours necessary to carry out its management, evaluation, 
and research responsibilities. On the basis of the 
submission, an allocation will be made by OMB, probably 
with a 10-15 percent cut in response burden, a goal 
announced for 1981. (Another cut is to be made the 
following year for a total cut of 30 percent over 2 
years.) The agency will then reallocate the information 
collection budget internally. In the case of the 
Department of Education, of 8-1/2 million burden hours 
that were requested, some 7 million hours, or more than 
80 percent, is allocated to program administration and 
compliance, that is, information to be submitted by 
program applicants and grantees, information needed for 
fiscal audits, and information needed to enforce 
compliance with rights laws. OMB will delegate the 
responsibility for clearance of specific studies and 
instruments to the agency's internal mechanism when it is 
deemed to be functioning well or the law so specifies, as 
is the case with FEDAC. 
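
The arithmetic behind the information collection budget can be sketched briefly. The two hour totals below come from the figures just cited; the function names are hypothetical, and applying the announced cuts to the requested total is an assumption made for illustration, not a description of how OMB computes allocations.

```python
# Figures taken from the text; everything else is illustrative.
REQUESTED_HOURS = 8_500_000          # "8-1/2 million burden hours"
ADMIN_COMPLIANCE_HOURS = 7_000_000   # "some 7 million hours"


def percent_share(part, total):
    """Share of the total, in percent, rounded to one decimal place."""
    return round(100 * part / total, 1)


def after_cut(hours, cut_percent):
    """Burden hours remaining after a percentage cut (integer arithmetic)."""
    return hours * (100 - cut_percent) // 100


admin_share = percent_share(ADMIN_COMPLIANCE_HOURS, REQUESTED_HOURS)
remaining_hours = REQUESTED_HOURS - ADMIN_COMPLIANCE_HOURS

print(f"administration and compliance: {admin_share} percent")
print(f"left for all other purposes: {remaining_hours:,} hours")

# Assumption: each cut is applied to the requested total.
for cut in (10, 15, 30):
    print(f"after a {cut} percent cut: {after_cut(REQUESTED_HOURS, cut):,} hours")
```

On these figures, evaluation, research, and general statistics together share about 1.5 million hours, which suggests how heavily any across-the-board cut could fall on them if administration and compliance requirements are treated as fixed.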

The evolution of clearance procedures from reviewing 
individual studies to a process that assembles all 
proposed data collection in one document should allow 
top-level Department officials to consider the data needs 
of evaluation and research in a forum where they are 
presented together with those of program administration, 
enforcement (for example, the data needs of the Office of 
Civil Rights), auditing, and the periodic gathering of 
general statistical data and indicators (for example, the 
data collected by NCES). It may also encourage the 
coordination of studies across organizational units so 
that studies proposed by one unit can use data collected 
elsewhere. The Department should be alert to the 
opportunities for more coherent evaluation and data 
collection activities offered by the new clearance 
process. 



Progress 

After clearance, time delays in the progress of a study 
will be occasioned by the inevitable discrepancies 
between assumptions in the study design and 
conditions in the field. The nature of the program 
activity, the individuals engaged in it, the willingness 
of respondents to cooperate, the presence of 
documentation: all will present unforeseen difficulties, 
particularly if the timing of the study is thrown off 
schedule by the clearance process. Other delays may be 
introduced by the researchers themselves, who are wary of 
potential criticism and therefore employ time-consuming 
procedures to assure technical impeccability that does 
not enhance the quality of the study (e.g., by meticulous 
but unwarranted cleaning of data sets). Federal monitors 
are often not in a position to know whether such 
procedures are necessary or which delays in the progress 
of the study are legitimate. Sanctions against 
contractors who do not deliver products on schedule are 
seldom enforced since extenuating circumstances can 
always be cited. This is particularly true because of 
the inability of federal monitors to respond in timely 
fashion to simple, much less to complex, requests for 
changes in the study plan, either because of their 
workload or because they do not have authority on their 
own to rule on the requested change. Hence, delay 
becomes no one's responsibility. 



Dissemination 

Within HEW in recent years, dissemination of study 
findings has been held up in the Secretary's office for 
many months because of the perceived need to have the 
Secretary informed and able to respond to inquiries from 
the media and the public. For example, for the study on 
sex equity in vocational education (referred to above) 
the findings were not released until nearly a year after 
the final report was submitted in April 1979. The delay 
appeared to be occasioned by the controversial subject of 
the study rather than by the findings themselves, since 
no changes were made in the final report (Harrison and 
Dahl 1979). 

The advent of the new Department of Education brought 
about new rules: a directive on release of findings 
(U.S. Department of Education 1980a) provides 10 days, 
after acceptance of the study report by the central 
evaluation unit, for response from program and other 
offices. Reports are to be released after the 10-day 
period, accompanied by the comments received. However, 
this rule does not deal with delays occasioned by 
disagreements between the sponsoring office and the 
performers or with release of findings by sponsoring 
offices other than the evaluation unit. For example, one 
study report submitted in January 1980 (David 1980), 
whose findings were in dispute between the sponsoring 
office (the Assistant Secretary for Program Evaluation in 
the former HEW) and the then central evaluation unit for 
the Office of Education, had still not been released 10 
months later. Congress in particular has been concerned 
with such delays: on occasion, the suspicion has arisen 
that findings were not being released because they did 
not support the positions of the current administration 
with respect to the program that was the subject of the 
study. 

In summary, process constraints have become severe in 
recent years. It is not unlikely that, during the time 
it takes to complete a study, conditions in the field or 
policy concerns regarding a specific program will change, 
making the findings of the evaluation, when they do 
become available, of little interest. 

Recommendation C-5. Congress should authorize a study 
group to analyze the combined effects of the legislative 
provisions and executive regulations that control 
federally funded applied research. 

Congress has been dissatisfied with the lack of 
relevance and timeliness of much evaluation work in 
education. One of the causes for delay and for 
irrelevance is the accumulation of rules and regulations 
governing the relationships between sponsor, researcher, 
and action site or agency, i.e., the Department of 
Education, the contractor, and the state/school/student. 
The whole process of funding and carrying out applied 
research about social services is severely constrained by 
these rules and by the operating precedents they have 
engendered. Almost every provision now on the books or 
enforced through executive practice may be justified when 
considered in isolation: to prevent favoritism in 
contract awards, to protect respondents from a heavy 
burden of requests for data, to protect the privacy of 
individuals, to require disclosure of information related 
to the public business, and so forth. Their combined 
effect, however, has been to lengthen the time needed for 
compliance, to increase the costs both within government 
(through greater investment of staff time) and of 
extramural contracts and grants, and to discourage whole 
classes of potential performers from participating. 
Though laws sometimes specify time limits for procedures 
(e.g., for OMB clearance of data collection instruments), 
they are seldom observed in practice. 

To date, most of the concern has been with instituting 
procedures to guard against possible transgressions in 
initiating and carrying out applied social science 
research. The trade-offs between the benefits of such 
safeguards and the obstacles they create to producing 
timely and relevant applied research at reasonable cost 
have been largely ignored. It is not clear how much of 
the negative effect is due to the laws and regulations 
themselves and how much to the interpretation and 
operational mechanisms within any given agency. For this 
reason, the recommended analysis must go beyond the 
problems within a single agency or department and examine 
the process as it works in several different agencies. 

Recommendation D-18. The Department of Education should 
take steps to simplify procedures for procuring 
evaluation studies, carrying them out, and disseminating 
their findings. 

The Committee has recommended (see Chapter 3) that the 
means by which the Department solicits, selects, and 
funds evaluation studies be expanded in order to allow 
more performers to participate. The competitive 
procurement process involving issuance of an RFP and 
awarding of a contract to the highest-ranked or 
lowest-priced bidder is by far the most commonly used 
form of solicitation. This type of solicitation was 
designed by the government for the purchase of highly 
specifiable goods or services so that contracts could be 
awarded on the basis of the best buy for the dollar. The 
rules that have accumulated over the years to ensure fair 
competition have shifted considerable control of the 
process from the technical specialists (for example, in 
the evaluation unit or in a research office) to the 
contracting office, the interpreters and enforcers of the 
government procurement regulations. This has had serious 
implications for the quality of evaluations (discussed in 
Chapter 3) and has increased the time needed for arriving 
at compromises acceptable to all. The process has become 
not only restrictive and inflexible but very costly in 
internal staff time and for potential contractors. And 
since the cost to contractors is recouped eventually from 
the government through overhead and in other ways, the 
government bears the double burden. 

Recent criticisms (U.S. General Accounting Office 
1980a, Gup and Neumann 1980) have focused on abuses 
possible in the use of consultants and sole-source 
contracting. The Committee is not convinced that the 
cost of rules instituted to prevent such abuses is not 
higher than the cost of the abuses themselves. The 
various means (other than competitive procurement through 
RFPs) that can be used to obtain evaluation work are 
discussed in Chapter 3 (see the corresponding 
recommendation). The Department must be more 
deliberative in choosing whether to use competitive 
procurement, sole-source contracting, 8(a) contracting, 
cooperative agreements, basic ordering agreements, or 
grant awards, within the limitations of the law (see 
P.L. 95-224). 

The major sources of delay, once a contract or grant 
for a study has been awarded, must also be identified and 
addressed. This applies particularly to clearance 
procedures and to the in-house handling of requests for 
changes in study design, sampling procedures, testing, 
analysis, time frames, and the like. While a request for 
a modification is being considered, the evaluation may be 
in a hold status, pending the sponsor's response. In 
such cases, the sponsor's nonresponsiveness can 
contribute materially to delays in project completion, 
with the effect of cost overruns. 

At times, failure to perform on time is the 
responsibility of the contractor or grantee. The 
Department should institute and enforce sanctions and 
incentives to encourage timely performance. For example, 
some agencies have included clauses in contracts that 
provide that nontimely performance (products not 
delivered by the specified date) can be a basis for 
nonpayment of up to one-third of the contractor's fee. 

Most contracted evaluations have provisions for review 
of delivered products by the project officer, which often 
may entail extensive internal review and clearance. To 
the extent that these reviews are not completed in an 
efficient and timely manner, the projects are subjected 
to time delays. Such delays may be as injurious as 
budget overruns, leading to delays in dissemination of 
findings and charges of lack of timeliness. Because of 
the possible cost of such delays, Recommendation D-13 
(see Chapter 4) seeks to limit the period of control over 
evaluation results. The Committee is not advising 
against review: quite the contrary. It is advocating 
that the time taken for internal review be shortened in 
favor of making findings freely available to stand the 
test of the marketplace. In the long run, this will both 
increase the quality and improve the chances of 
appropriate use of evaluation results. 



NOTES 



There are exceptions. Political appointees given the 
job of reducing the budget will have reasons to find 
reduced needs. 

At present, the Office of Civil Rights (OCR) is 
funding a study to review testing and evaluation 
instruments used with handicapped persons and another 
study to identify the factors that cause 
overrepresentation of minority children in programs 
for the mentally retarded. OCR has also funded 
cost-benefit analyses of programs mandated under civil 
rights legislation (O'Neill 1976). 

For fiscal 1980, the budget for the Department of 
Education was $14.2 billion. For ESEA Title I, the 
1980 budget provided $3.2 billion; for Education for 
the Handicapped, $1.05 billion, and for Rehabilitation 
Services and Handicapped Research, $932 million; for 
vocational education, $928 million; for impact aid, 
$825 million; for emergency school aid, $249 million; 
and, for bilingual education, $167 million. 

As an example, when the National Institute of 
Education was under an edict from its governing body, 
the National Council for Educational Research, to 
increase the percentage of funds spent for basic 
research, it shifted its labeling of certain 
activities from "evaluation" to "research." Since the 
boundaries are often fuzzy, this kind of redefinition 
is not infrequent. As a counterexample, nearly $1 
million allocated to the evaluation of Title VII 
(bilingual education) were reprogrammed in fiscal 1980 
by the former Assistant Secretary for Education in HEW 
to support further development of "Villa Alegre" (the 
bilingual analog to "Sesame Street"), a decrease of 
more than one-third in the actual evaluation budget, 
though reporting figures stayed unchanged (see 
Appendix A). 

"Best" has different connotations in different 
instances: it may mean the lowest-priced proposal of 
those technically acceptable; it may mean the 
lowest-priced proposal of those exhibiting high 
degrees of excellence; or it may mean some combination 
of these and other criteria spelled out in the RFP. 

The information on this study was provided by Robert 
Maroney and Dorothy Shuler of the Office of Program 
Evaluation, the central evaluation unit. Their help 
in tracing the clearance procedures and other process 
steps is gratefully acknowledged. 

In that study, Larson gives some additional reasons for 
the absence of data on race and sex variables: such 
data are deemed to be irrelevant or dangerous, they 
raise costs by requiring larger samples, and they are 
the concern of enforcement rather than of evaluation 
staff. 

FEDAC has a permanent staff of four professionals, 
augmented by three to four professionals on detail 
from other units or from outside the Department. From 
time to time, however, FEDAC staff are themselves 
detailed for considerable periods of time to other 
duties. Staff shortage has been a major cause of 
delays in obtaining clearance. 



Glossary 



AKRA 
APT 
AXH 
ARROG 

'ASPB 

ASPIRA 

BGH 

BOAE 

ccso 

CEIS 



CENTRAL- 

(EVALUATION) 
UNIT 



Amwricdo bJduoational HoM««roh Aaaualatiun 

American Foaer^tion of Toachora 

American Inatituteu Cor Research 

American Registry of Research and Related 
Organizations In Education 

Assistant Secretary for Planning and 
Evaluation 

An educational research group oriented 
toward Puerto Rican interests 

I 

Bureau of Education for the Handicapped 
(OE) , now Division for Special Education 
and Rehabilitation Services 

Bureau of Occupational and Adult Education 
(OE), now Division of Vocational aud Adult 
Education 

Council of Cllief State School Officers 

Committee on ^Evaluation and Information 
Systems 

formerly the Office Sf Program Planning, 
Budgeting, anb Evaluation (OPPBE/OE) , later 
the Office ofl Evaluati(^n and Dissemination 
(OED/OE) , now* the Office of Program 
Evaluation (OpE/ED) 
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Qommo tiHmtii the Coalition ot t^p^nlsh-^fipe^kinfj 

Mt^ntAl [{odUh QrqMiUKatipniir now th«^ 
National UoftUtlQn lllap^nin Mantel 

DISTAU A I'uatVlnu progiam toe primacy yradttH 

W U.S. Department ot KUuuation 

lillJAC KrUionUlaiial Data Acquiaitlon Counatl 

l«n)UCOM Ino A private aorporatlon performing 

iuiuaatlonal reuearoh anrl tleveXopment 

EUIC Edurat tonal Reaouroea Information Center 

ESAA Emergency School Aaalatanoe Act 

ESAA-TV i A aerioa o£ televlaion programs aimed at 
i minority group children of aohool age 

ESEA ' Elementary and Secondary Education Act 

FEDAC Federal Educational Data Acqulaition Council 

FNS Food and Nutrition Service of the U.S. 

Department of Agriculture 

FY Fiacal Year 

GAO U.S. General Accounting Office 

GPMD Grant and Procurement Management Divisioh 

(OE) 

HEW U.S. Department of Health, Education, and 

Welfare 

HHS U.S. Department of Health and Human Services 

IDEA Institute for Development of Educational, 

Activities 
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lEP Individual Education Plan (P.L. 94-142) 

IG Inspector General 

ISA Intermediate Service Agency (set up by SEAs 

and LEAs to provide services to LEAs) 

ISD Independent School District 

»JDRP Joint Dissemination Review Panel (OE-NIE) 

I<EA Local Education Agency 

MDRC Manpower Demonstration Research Corporation 

NAACP National Association for the Advancement of 

Colored People 

NCES National Center for Education Statistics 

NEA National Education Association 

NIE National Institute of Education 

NIH National institutes of Health 

NSP National Science Foundation 

Office of Education 

OED Office of Evaluation and Dissemination 

(central evaluation unit in OE) 

0MB Office of Management and Budget 

'^PE Office of Program Evaluation (current 

designation of central evaluation unit in 
Division of Management, ED) 

OPPBE Office of Program Planning, Budgeting, and 

Evaluation (former title of central 
evaluation unit in OE) 

PAC Parent Advisory Committee (Title I, ESEA) 

Public Law (for example, P.L. 94-142, 
Education for All Handicapped Children Act > 
of 1975) 
R&D Research and Development

RDD&E Research, Development, Dissemination, and Evaluation

RDU Research and Development Utilization Program (NIE)

RFP Request for Proposal

RFQ Request for Qualifications

SBA Small Business Administration

SDC Systems Development Corporation

SEA State Education Agency

TAC Technical Assistance Center (Title I, ESEA)

USOE U.S. Office of Education
APPENDIX A

Federal Evaluation Activities in Education: 
An Overview 
Elizabeth R. Reisner 



Federal funds support a broad range of program evaluation 
activities in education. Such activities range from 
national studies involving achievement testing of 
thousands of students to local assessments of federally
supported projects in individual school districts. 

This paper is intended to provide an overview of those 
federal evaluation activities that are designed to yield 
information on federal education assistance programs. 
The first section of this paper describes the major 
evaluation activities of each of the organizational units 
making up the former Education Division of the former 
U.S. Department of Health, Education, and Welfare (HEW)
and certain other units. Taken together, these units
constitute the main offices currently conducting
evaluation activities in the U.S. Department of Education
(ED). Information on evaluation activities of these
offices is presented in tabular form and contains (1) a
listing of the major federal education programs being
evaluated by each of the organizational units sponsoring 
education evaluations, (2) a description of each unit's 
principal evaluation objectives, and (3) a rough estimate 
of the fiscal 1980 funds used for evaluation by each of 
the units. 



The author is a senior policy analyst with NTS Research 
Corporation in Washington, D.C. Previously, she had
staff responsibility for the review of evaluation 
planning in the Office of Education. 
The second section of this paper provides anecdotal
information on federally supported evaluations conducted 
by state and local agencies. The third section describes 
the evolution of the federal role in the evaluation of 
education programs. The final section describes the 
process used for deciding what national studies of 
federal education programs are conducted and what 
questions those studies address. 

Information for this study was collected in interviews
with federal managers whose offices are responsible for
conducting program evaluations in education as well as
from the works listed in the references. In several
instances internal memoranda of HEW, the Office of
Education (OE), and ED were used as source materials.
Because the intent of the paper is to present a broad 
overview of the topic, it has been necessary to summarize 
detailed information in a number of cases; the author 
accepts full responsibility for any unintentional errors 
of fact or emphasis that may have occurred in preparing 
the summaries. 

Authority for the Department of Education was enacted
on October 17, 1979, as P.L. 96-88, the Department of 
Education Organization Act. The act permitted a 6-month 
implementation period prior to official start-up of the 
new department. ED was officially inaugurated on May 4, 
1980. In this paper, policies and procedures in effect 
prior to that date are described using the earlier 
organizational terminology (e.g., OE and the
Commissioner). Current terminology (e.g., ED and the
Secretary) is used to describe activities occurring after
May 4, 1980.



MAJOR EVALUATION ACTIVITIES 
OF THE HEW EDUCATION DIVISION 

Table A-1 provides summary descriptions of federally 
supported evaluation activities designed to provide 
information relevant to programs administered by the 
former HEW Education Division. The primary offices 
within the Education Division were OE, the National 
Institute of Education (NIE), and the National Center for
Education Statistics (NCES); these offices are now
organizationally situated within ED. The information in 
Table A-1 pertains primarily to former Education Division 
offices because these are the offices for which 
comparable information was most readily available. 
Information from HEW's planning office and from the
Office of the Inspector General is also included, along
with data on activities of the U.S. General Accounting
Office. Although data in the table were compiled in May
1980, there have not been major changes in the use of
fiscal 1980 funds.

A broad, inclusive definition of program evaluation
was used in compiling the data presented in Table A-1.
It is adapted from the definition used by Robert Boruch
in his proposal to OE to conduct a study of federally
supported education evaluations at state and local levels
(discussed in the second section of this report).
Boruch's definition, which is consonant with that used by
the Committee (see Chapter 2), includes the following
activities under the heading of program evaluation:
needs assessments, surveys, and other assessments
conducted prior to program initiation or review; process,
or formative, assessments intended to yield descriptive
information on the composition, organization, or
activities of a program; outcome, or summative,
assessments intended to yield information on the relative
benefits, costs, and other effects of a program; and
cost/benefit analyses intended to draw together
information on several types of program effects.

The category headings used in Table A-1 are as follows:

• "Federal office conducting evaluation activities"
refers to offices implementing evaluations (for in-house
efforts) and offices overseeing evaluation contracts (for
contracted studies). The organizational headings do not
necessarily reflect offices of equal bureaucratic rank.

• "Programs being evaluated" refers to the
principal federal programs that are being studied.

• "Main evaluation objectives" reflects the
priorities as described by federal evaluation managers in
interviews for this project and in written statements
prepared as part of the HEW evaluation planning process.
The information in the table does not include federally
supported evaluations conducted by local projects for
purposes of either self-assessment or fulfillment of
federal program requirements.

• "Federal funds used for evaluation in fiscal
1980" comprises estimates reported by evaluation managers
and described in internal planning papers. Funds used in
fiscal 1980 are indicated because that is the most recent
year for which fairly precise estimates are available.




TABLE A-1 Federal Evaluation Activities in Support of Programs Administered by the HEW
Education Division

Columns: Federal Office Conducting Evaluation Activities; Programs
Being Evaluated; Main Evaluation Objectives; Federal Funds Used for
Evaluation in Fiscal 1980 ($ thousands); Special Features

Office of Education (OE)

Office of Evaluation and Dissemination (OED)

Elementary and secondary programs
Programs Being Evaluated: Title I of the Elementary and Secondary
Education Act (ESEA), emergency school aid, bilingual education,
Title IV civil rights, national diffusion network, and impact aid.
Main Evaluation Objectives: Assessment of impact of program
services on students (e.g., Title I, bilingual, and emergency school
aid); description of program services, especially with regard to
beneficiaries (e.g., Title I) and classroom activities (e.g.,
bilingual); provision of technical assistance for the improvement of
state and local evaluations (e.g., Title I).
Special Features: Impact studies play a decreasing role in overall
efforts; increasing emphasis on support to state and local evaluation
activities and on measurement of federal program implementation at
the state and local levels.



Occupational, handicapped, and developmental programs
Programs Being Evaluated: Vocational education, education of the
handicapped, adult education, Indian education, libraries,
educational technologies, and special projects (e.g., teacher
centers and basic skills).
Main Evaluation Objectives: Response to congressionally mandated
studies (e.g., vocational education, career education, and community
education); information on impact of service delivery programs (e.g.,
libraries); exploratory evaluations, as described at entry for HEW
Assistant Secretary for Planning and Evaluation (e.g., gifted and
talented).
Special Features: Significant portions of overall funding come from
regular program accounts and from program administrative accounts, at
the decision of program managers (e.g., Indian education and
community education).



Postsecondary programs
Programs Being Evaluated: Postsecondary grant and loan programs for
students and discretionary grant programs for institutions (e.g.,
developing institutions and special services for disadvantaged
students).
Main Evaluation Objectives: In student aid, (1) assessment of
program impact as measured by reduction of financial barriers for
students and (2) improvement in management of aid programs; in
institutional aid programs, assessment of impact in terms of (1)
increased financial stability and program quality (e.g., developing
institutions) and (2) increased enrollment rates of disadvantaged
students (e.g., special services for disadvantaged students).



Subtotal for OED: [illegible]

Bureau of Education for the Handicapped
Programs Being Evaluated: Programs administered by the Bureau,
especially state grants for education of handicapped children.
Main Evaluation Objectives: Fulfillment of mandated study and
reporting objectives in Education of All Handicapped Children Act
(P.L. 94-142), with special attention to the state approaches and
practices that are most effective in the identification and delivery
of services to handicapped children. Current projects include
surveys of [illegible] practices [illegible] to handicapped children.
Federal Funds Used for Evaluation in Fiscal 1980 ($ thousands):
[illegible]
Special Features: Evaluation activities based on multiyear
evaluation plan developed and distributed following enactment of
P.L. 94-142.



Bureau of Student Financial Assistance
Programs Being Evaluated: Postsecondary student aid programs (e.g.,
basic educational opportunity grants and guaranteed student loans).
Main Evaluation Objectives: Objectives similar to the student
aid-related objectives of OED Postsecondary Programs Office, except
less emphasis on program impact and more emphasis on management
improvement; special attention to collection and analysis of data
necessary for adjusting aid formulas to target intended students,
while reducing instances of fraud and abuse of federal funds.



Bureau of Occupational and Adult Education (BOAE)
Programs Being Evaluated: Programs authorized by the Vocational
Education Act and Adult Education Act.
Main Evaluation Objectives: Assessment of current needs for
vocational education and technical assistance for state and local
evaluations of federally supported activities.
Special Features: Activities carried out primarily by National
Center for Research in Vocational Education located at Ohio State
University (total fiscal 1980 funding was [amount illegible]
million).
Programs Being Evaluated: Follow Through, a discretionary grant
program for disadvantaged children in the primary grades (K-3).
Main Evaluation Objectives: Current stress on improving the
delivery of services to Follow Through grantees through the
development of performance indicators for project implementation.
Research, now separate from evaluation, concerned with development
of new models and analysis of variables affecting implementation and
results.
Federal Funds Used for Evaluation in Fiscal 1980 ($ thousands):
1,000 for evaluation and research combined (full 1,000 used to
continue [illegible] below).
Special Features: Evaluation and research activities have been
transferred from OED to program office, as a result of
recommendation from exploratory evaluation conducted by Assistant
Secretary for Planning and Evaluation.

Subtotal for OE: [illegible]



National Institute of Education (NIE)

Testing, assessment, and evaluation
Programs Being Evaluated: Several small, urban education programs
(e.g., PUSH-Excel and Cities in Schools) and general state and local
instructional programs.
Main Evaluation Objectives: Improvement of local instructional
practice, through (1) evaluations aimed at meeting needs identified
by local instructional and administrative personnel (for small urban
programs), (2) assistance to state and local educational personnel in
improving quality of evaluations, and (3) research in evaluation.
Federal Funds Used for Evaluation in Fiscal 1980 ($ thousands):
1,000 [row placement uncertain in source]
Special Features: Evaluation efforts not primarily oriented toward
improvement in implementation of major federal education programs.

Mandated study of vocational education
Programs Being Evaluated: Programs administered by BOAE and, to a
lesser extent, the Department of Labor's employment training
programs for young people.
Main Evaluation Objectives: Assessment of policies (e.g.,
improvement in match between training activities and job
opportunities) and mechanisms (e.g., open planning process)
underlying the Vocational Education Act; studies not intended to
evaluate current programs' impact on students.
Special Features: Study mandated by Congress in the Education
Amendments of 1976; comprehensive study plan submitted to Congress
at beginning of study, followed by periodic reports. Overall effort
patterned after congressionally mandated study of compensatory
education, 1975-78.

Dissemination and improvement of practice
Programs Being Evaluated: NIE programs concerned with dissemination
and improvement of practice, e.g., state capacity building for
dissemination, RDU, ERIC, and the women and minorities program.
Main Evaluation Objectives: Assessment of four NIE programs to
determine effectiveness of approaches to the transfer of educational
research and development to educational practitioners.
Special Features: Current NIE programs in area of knowledge
transfer being used as vehicles for research into alternative
methods of educational dissemination.

Subtotal for NIE: [illegible]

National Center for Education Statistics (NCES)
Programs Being Evaluated: In addition to regular educational
surveys, special quasi-evaluation activities, such as:
• Evaluation of NCES technical assistance to users of NCES data.
• Validity studies of ongoing surveys (Vocational Education Data
System and Higher Education General Information Survey in FY 1980).
• Fast Response Survey on policy issues, as requested by policy
offices.
Main Evaluation Objectives: Assessment of NCES's own effectiveness
in (1) making its data accessible and relevant to users (e.g.,
evaluation of NCES technical assistance) and (2) designing and
implementing its surveys (e.g., validity studies); also, provision
of needs-assessment-type data on a rapid basis (3-[illegible]
months) for use in policy making.
Special Features: All NCES activities have potential
evaluation-related uses, since they may provide information on need
for changes or adjustments in federal programs. Total appropriation
for NCES in fiscal 1980 was $10 million.



HEW Office of the Assistant Secretary for Planning and Evaluation
(ASPE)

Education Planning
Programs Being Evaluated: Education Division programs with large
fiscal outlays (e.g., ESEA Title I) or with especially promising
educational approaches (e.g., Fund for the Improvement of
Postsecondary Education).
Main Evaluation Objectives: Examination of national policy
alternatives, often through reanalyses of data collected by other
agencies (e.g., OE, NCES, Bureau of the Census); oversight of OE
evaluation activities (e.g., ESEA Title I).
Special Features: Occasional requests for OE to conduct specific
evaluation studies.




Evaluation
Programs Being Evaluated: Representative sample of OE programs,
broken down into "large formula grants," "large discretionary
grants," and "small discretionary grants."
Main Evaluation Objectives: Identification of measurable program
objectives and development of appropriate measures for use by
program managers in assessing whether objectives are being met
(e.g., Follow Through and bilingual education).
Special Features: Exploratory evaluation approach being used for
some studies conducted by OE.

Subtotal for ASPE: 800

HEW Office of the Inspector General (IG)
Programs Being Evaluated: OE programs with large fiscal outlays
(e.g., ESEA Title I, and postsecondary grant and loan programs) and
programs with legislatively mandated audit requirements (e.g.,
vocational education state grants).
Main Evaluation Objectives: Auditing of activities at federal,
state, and local levels to determine (1) adherence to principles of
sound fiscal management and (2) compliance with pertinent legal
requirements (e.g., Title I requirement that federal dollars must
supplement and not supplant state and local spending on target
children).
Federal Funds Used for Evaluation in Fiscal 1980 ($ thousands):
10,000 (200 staff-years, estimated at $50,000 per staff-year).
Special Features: Planned HEW/IG activities for fiscal 1980
reportedly canceled in anticipation of new IG for Department of
Education.



General Accounting Office (GAO)
Programs Being Evaluated: OE programs or program components
believed to have serious management problems (e.g., developing
institutions, student aid eligibility for proprietary institutions,
and defaults in the guaranteed student loan program) or unclear
program objectives (e.g., Follow Through and bilingual education);
also programs coming up for reauthorization in Congress.
Main Evaluation Objectives: Assessment of the federal
administration of educational programs and evaluation of program
impact on intended beneficiaries. Studies focused on generating
program recommendations for Congress and for relevant federal
agencies.
Federal Funds Used for Evaluation in Fiscal 1980 ($ thousands):
2,500 (50 staff-years, estimated at $50,000 per staff-year).
Special Features: Programs selected for review according to
requests from members of Congress or GAO staff.



Subtotal for all offices except IG and GAO: 31,430

Total: 43,930

• "Special features" contains miscellaneous
information relevant to evaluation activities of several
of the offices indicated.
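
The staff-year estimates that Table A-1 reports for the IG and GAO entries (200 and 50 staff-years, each estimated at $50,000 per staff-year) can be converted to the table's unit of thousands of dollars with simple arithmetic. The sketch below is illustrative only; the function name `staff_year_cost_thousands` is hypothetical and not drawn from any source document.

```python
def staff_year_cost_thousands(staff_years: int, cost_per_staff_year: int = 50_000) -> int:
    """Estimate a cost in thousands of dollars from a staff-year count.

    The $50,000 default is the per-staff-year estimate quoted in Table A-1.
    """
    return staff_years * cost_per_staff_year // 1_000


if __name__ == "__main__":
    # Staff-year counts quoted in Table A-1 for the two audit-oriented offices.
    for office, staff_years in (("IG", 200), ("GAO", 50)):
        print(office, staff_year_cost_thousands(staff_years))
```

Because both figures rest on a flat per-staff-year estimate, they are best read as rough orders of magnitude rather than as accounting totals.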

Among the categories of information presented in Table 
A-1, the category most vulnerable to change is the annual 
funding data. These amounts are subject to considerable 
fluctuation within any given year because of decisions to
move funds into or out of accounts previously designated
for evaluations and because of different interpretations
as to whether a given project is an evaluation or a
research activity. An example of the first type of
fluctuation was the decision early in fiscal 1980 to
transfer funds out of the "line item" appropriation for
studies and evaluation of bilingual education programs in
order to fund a bilingual television project. A total of
$700,000 in OE funds for federal program administration
was designated to be used to replace the transferred sum,
but because of high expenses associated with implementing
the new ED, the bilingual evaluation funds were not
replaced. An example of the second type of fluctuation
can be seen in NIE's reports of its own program
expenditures. Because of an administrative decision to
allot the maximum amount of NIE's funding to research
purposes, the Institute intentionally labels very few of
its projects as evaluations, even though many have
characteristics that conform to the definition presented
above.

The aspect of the table most likely to provoke
questions from readers is the inclusion of federally
conducted audits of federal, state, and local
implementation of federal programs. Audits are generally
not considered to be evaluative in nature, especially
since they usually focus only on the fiscal operations of
individual federally funded projects. In recent years,
however, federal audits have become increasingly
concerned with nonfiscal matters, particularly state and
local compliance with legislated objectives and
procedures. The adoption of this auditing focus has
resulted, in some instances, in a blurring of the
distinction between audits and evaluations, particularly
given the establishment of specified national priorities
for federal education audits. For example, the fiscal
1980 work plan for the HEW Office of the Inspector
General identified three priorities for audits of state
and local administration of Title I of the Elementary and
Secondary Education Act (ESEA): (1) compliance with the
Title I statutory requirement for annual maintenance of
local fiscal effort per pupil; (2) implementation of
Title I state requirements for monitoring and enforcement
plans; and (3) operations of the centralized Migrant
Student Record Transfer Service funded under Title I.
With the establishment of explicit compliance-oriented
auditing objectives such as these, federally conducted
audits have acquired a distinct resemblance to program
evaluations.



FEDERALLY SUPPORTED EVALUATIONS
CONDUCTED BY STATE AND LOCAL AGENCIES 

Virtually all federal education aid programs require 
institutional grantees to conduct evaluations of their 
own performance. The specific language of the 
legislative requirements varies among programs, depending 
on the overall objectives of the program and also on the 
evaluation methodologies considered by federal 
administrators to be best suited to the particular 
program. For programs with a large state administrative 
role, such as ESEA Title I and the state grant program 
under the Vocational Education Act, states are also
required either (1) to collect local evaluation data and
provide summaries of these data to ED on a regular basis
or (2) to carry out their own state-managed evaluation
efforts. 

In recent years congressional mandates and Education
Division program managers have identified state and local
evaluation priorities with increasing specificity, but the
offices of the former Education Division do not at
present collect regular data on the implementation of
state and local evaluation requirements. Therefore, it
is not possible to determine what portion of ED program
grant funds is used by grantees for self-evaluation
purposes, nor is it possible to determine exactly how
those funds are used. It is apparent, however, that
significant amounts of federal funds are used to provide
assistance to state and local agencies in improving the
quality of their evaluations.

Evaluations conducted by state and local agencies are 
generally funded using program grant funds. At the state 
level, evaluation activities are supported using state 
administrative funding provided by the pertinent federal
program. ESEA Title I, for example, provides each state
educational agency with 1.5 percent of the state's total
Title I funding for purposes of state administrative
activities, including Title I program evaluation. In
school year 1979-80, amounts available for Title I state
administrative activities, including evaluation, ranged
from $4.5 million in New York to $225,000 in the 14
states with the lowest Title I enrollments. Other
federal education programs also provide administrative
funding to state education agencies.
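
The Title I administrative set-aside described above can be worked through as a short calculation. The sketch below is illustrative and rests on one labeled assumption: that the $225,000 reported for the 14 lowest-enrollment states acts as a minimum allotment. The text reports the identical figure for those states but does not state that rule. The function names are hypothetical, and the total computed for New York is only what the 1.5 percent rate implies, not a reported figure.

```python
ADMIN_SHARE_BASIS_POINTS = 150   # 1.5 percent, as stated in the text
ASSUMED_MINIMUM = 225_000        # assumption: treated here as a floor, inferred
                                 # from the identical figure for 14 states


def state_admin_funds(title_i_total: int, minimum: int = ASSUMED_MINIMUM) -> int:
    """Dollars available for state administration, including evaluation."""
    share = title_i_total * ADMIN_SHARE_BASIS_POINTS // 10_000
    return max(share, minimum)


def implied_title_i_total(admin_funds: int) -> int:
    """State Title I total implied by an administrative amount at 1.5 percent."""
    return admin_funds * 10_000 // ADMIN_SHARE_BASIS_POINTS


if __name__ == "__main__":
    # New York's $4.5 million implies a state Title I total at the 1.5 percent rate.
    print(implied_title_i_total(4_500_000))
    # Under the assumed floor, a state whose 1.5 percent share falls below
    # $225,000 would still receive $225,000.
    print(state_admin_funds(10_000_000))
```

Integer arithmetic in basis points is used so that the percentages come out exact rather than subject to floating-point rounding.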

At the local level, evaluation activities must be 
supported out of each school district's federal grant 
funds. The district's grant application usually
describes the evaluation activities planned by the
district and indicates how much of its grant is proposed
to be used for evaluation purposes. That proposal is not
generally binding on the district, however, once the
federal grant is received. (For more detail on the
funding and management of local evaluation activities,
see Appendix C.) Examples of the types of state and
local evaluation activities carried out under three 
federal education programs are described below. 



ESEA Title I 

As a result of a requirement contained in the Education 
Amendments of 1974 (P.L. 93-380), OE developed a set of
local evaluation models for use by Title I grantees. The
models, as specified in federal regulations (45 CFR 116.7
and 116a.50-57, published in the Federal Register
October 12, 1979), provide methods for measuring student
achievement gains in reading, mathematics, and language
arts; ED (and formerly OE) also provides technical
assistance (at a cost of $11 million in fiscal 1980) to
state education agencies on methods for assisting local
districts in the use of the models. Despite extensive
efforts by OE since 1974 in designing and implementing
the models, Congress has expressed concern in committee
reports for the Education Amendments of 1978 (P.L.
95-561) that the Title I evaluation models do not yield
data that can be used by local Title I administrators as
a basis for improving Title I projects (U.S. Congress
1978a:51, and U.S. Congress 1978b:29-30). Findings in
support of this view have also been presented by David
(1980) and Orland (1980), but they are contradicted by
statements of the ED evaluation office.
ESEA Title VII

The Education Amendments of 1974 also mandated that
evaluation models be developed for use by local districts
receiving funds under ESEA Title VII, the Bilingual
Education Act. The Education Division did not
immediately implement that mandate, however, and it was
reiterated in the Education Amendments of 1978. The
Senate committee report on the 1978 amendments expressed
hope "that these guidelines will provide scientifically
valid information as well as describe the unique features
of each project in order that local level projects can be
validly compared" (U.S. Congress 1978b:69). The ED
evaluation office is currently overseeing a project
intended to yield evaluation models for use by Title VII
grantees. In early descriptions of the project, the
evaluation office has stated that the models are to be
designed on the basis of existing approaches (including
the current Title I evaluation models) and are not to
reflect any new or "basic research."

As in Title I, the Title VII program also funds
technical assistance providers who are expected to assist
local Title VII grantees in improving the quality of
their self-evaluations. Until the evaluation models are
ready, however, grantees and assistance providers have
relatively little guidance on which to base local
evaluation efforts, except for criteria in the Title VII
final regulations requiring attention to "data collection
instruments and methods," "data analysis procedures,"
"time schedules," and the like (45 CFR 123a.30(e),
published in the Federal Register on April 4, 1980).



Vocational Education Act

The Education Amendments of 1976 (P.L. 94-482)
established a comparable set of requirements for the
Vocational Education Act. States are required to use
"statistically-valid sampling techniques" to measure "the
extent to which program completers and leavers (i) find
employment in occupations related to their training, and
(ii) are considered by their employers to be well-trained
and prepared for employment" (Section 112(b)(1)(B) of
the Vocational Education Act). In addition, the
legislatively mandated "national center for research in
vocational education" is to "work with states, local
educational agencies, and other public agencies in
developing methods of evaluating programs, including the
follow-up studies of program completers and leavers
required by Section 112, so that these agencies can offer
job training programs which are more closely related to
the types of jobs available in their communities,
regions, and states..." (Section 171(a)(2) of the Act).
The national center at Ohio State University has prepared
materials relevant to its technical assistance role; a
recent list of its activities includes three projects
aimed at implementing this mandate: "Evaluation Services
for Education Agencies," "Evaluation Handbooks," and
"Increasing the Credibility of Vocational Education
Evaluations" (listed in Gordon et al. 1979:62-63, 153).
The NIE mandated study of vocational education is 
currently examining the performance of states in 
implementing their evaluation requirements. 

Studies of State and Local Evaluation Activities 

Despite these extensive statutory mandates for state and
local evaluations, the only effort up to now to review 
federally supported state and local evaluations across 
federal programs has been the recent study by Boruch and 
Cordray (1980). That study provides information on those 
state and local evaluation activities aimed at producing 
data relevant to federal categorical programs. There are
also three studies (one of which is under way now) that 
provide information on state and local evaluation 
activities supported from a variety of sources, federal 
and nonfederal. 

Survey of large school district evaluation units. The
Center for the Study of Evaluation at the University of 
California at Los Angeles has examined the organization 
of local school district offices of evaluation. This 
survey acquired data on the size, staffing, and 
organizational structure of evaluation offices in school 
districts with enrollments over 10,000 (Lyon et al. 1978). 

Survey of educational researchers and research 
organizations. Under contract with NIE, the Bureau of
Social Science Research in 1976-78 surveyed nonfederal
organizations conducting research, development,
dissemination, and evaluation activities in education.
Information was obtained on funding, organizational
characteristics, and activities of 2,434 such entities
(Frankel et al. 1979) (see Appendix B).
Study of how school districts use information from
testing and evaluation. Currently under way through an
NIE contract to the Huron Institute in Cambridge,
Massachusetts, this study is intended to develop
strategies for helping school districts make better use
of evaluation and test information. Initial reports from
the study were made available in the fall of 1980; the
final report is to be issued in the fall of 1981. 

Although each of these studies sheds light on state 
and local evaluation activities in education, none 
provides a comprehensive description of state and local 
evaluations undertaken to assess the operations of 
federal programs. 



EVOLUTION OF THE FEDERAL ROLE 
IN THE EVALUATION OF EDUCATION PROGRAMS 

Evaluation requirements are a relatively recent addition 
to federal education programs. The first mandatory 
evaluations for an OE program were those carried out by 
local districts implementing ESEA Title I projects. In 
1965 Senator Robert Kennedy introduced language into the 
draft version of Title I requiring that "effective 
procedures, including provision for appropriate objective 
measurements of educational achievement, will be adopted 
for evaluating at least annually the effectiveness of the 
programs in meeting the special educational needs of 
educationally deprived children" (Section 205 (a)(5), 
P.L. 89-10). Over the next several years local 
evaluation requirements were added to other OE program 
authorities, and by 1970 several OE bureaus had 
designated evaluation coordinators whose role was to 
oversee local evaluation efforts and occasionally to 
conduct small studies at the national level, usually 
relying on OE general administrative funds (under the 
"Salaries and Expenses" account in the annual 
appropriation) for financial support of any contracted 
projects. 

The fiscal 1970 appropriation for OE contained for the 
first time, however, a $9.5 million line item for OE 
evaluation and planning activities. Also in that year 
John W. Evans was named to head the first OE-wide 
evaluation office to oversee the expenditure of those 
funds. To administer a centralized evaluation and 
planning function, Evans assembled an evaluation staff, 
composed largely of the evaluation coordinators who had 
been working at the bureau level, and also consolidated 
the various other pockets of federal funds that had until 
then been sources of bureau-level evaluation support. 
After that beginning, the activities of the evaluation 
office grew steadily for the next several years. 

With the legislative creation of NIE in 1972, the 
organizational structure for OE studies of education 
programs was altered somewhat. With a few exceptions, 
those OE functions that were primarily research oriented 
were transferred to the new agency. Notable exceptions 
were the research activities carried out as an adjunct to 
the OE program for the education of handicapped 
children.^1 The director of the program argued that the 
research activities for the education of the handicapped 
were so closely related to state and local program 
support activities that handicapped research should not 
be moved to NIE. The OE handicapped office was 
successful in this argument and thus paved the way for 
the 1975 legislative directive in the Education for All 
Handicapped Children Act (P.L. 94-142) that the major 
national evaluation activities required in the Act were 
to be administered by the OE Bureau of Education for the 
Handicapped (BEH) and not in the central OE evaluation 
office. 

The move towards decentralization of evaluation 
functions was underscored by language specifying that the 
new national center for research in vocational education 
was to be lodged in OE. This action had implications for 
OE evaluations because the research center was given 
specific responsibilities for developing evaluation 
methods and assisting state and local agencies in 
implementing program evaluations. In the trend towards 
decentralization of evaluation activities, it was equally 
important that Congress specified in this vocational 
education statute (Section 160 (a)(1)) that "the 
administration of all the programs administered by this 
Act" was to be the responsibility of the Bureau of 
Occupational and Adult Education (BOAE). Thus, the 
management of the national vocational research center and 
its mandated evaluation activities were explicitly 
assigned to the OE operating bureau, not to the central 
evaluation office or to NIE. 

The most recent step in this trend has been the shift 
of responsibility for evaluation of the Follow Through 
program. As a result of a short-term "exploratory" 
evaluation of the program, the OE Commissioner in 1979 
decided to move Follow Through evaluation activities from 
the central evaluation office to the Follow Through 
program office. This move was sought by Follow Through 
program staff for the stated purpose of making the Follow 
Through studies more relevant to program operations. 
Undoubtedly, another factor was displeasure of the staff 
with a recent large evaluation of the impact of Follow 
Through services on student development, reflecting a 
frequent pattern of program office/evaluation office 
tension (noted in the final section of this paper). 

In addition to the handicapped, vocational, and Follow 
Through evaluation activities, OE's evaluation function 
had been decentralized in several other ways even before 
the new ED was created. The evaluation office, for 
example, has invited the participation of program 
managers in all major decisions affecting evaluations in 
their respective program areas. The evaluation planning 
process, described in the following section, relies 
heavily on the judgments and recommendations of program 
managers. The importance of this consultation is in some 
senses highlighted by the increase in statutory 
set-asides of annual program appropriations for national 
evaluations. The Emergency School Aid Act of 1972 (P.L. 
92-318) specified a set-aside of up to 1 percent of 
annual appropriations for national program evaluations. 
Two years later, the 1974 reauthorization of ESEA Title I 
authorized up to one-half of 1 percent of annual Title I 
appropriations for program evaluation and studies. In a 
slightly different pattern, the 1974 reauthorization of 
ESEA Title VII established a new "Part C - Supportive 
Services and Activities" to be administered by the HEW 
Assistant Secretary for Education. The 1978 amendments 
to Part C authorized studies that are clearly evaluative 
in nature, including studies of Title VII effects on 
students with language proficiencies other than English 
and of methods for identifying students to be served by 
Title VII projects. Because the statute assigned 
administrative authority for Part C to the HEW Assistant 
Secretary for Education, the OE evaluation office was 
only one of four offices that have in the past several 
years reviewed plans for bilingual activities; the other 
offices have been the OE Office of Bilingual Education, 
NIE (since it is given specific statutory 
responsibilities under Part C), and NCES (since it 
conducts statistical studies supporting Title VII). 
Under the new Department of Education, the Part C 
coordinating function is being carried out by the Office 
of Bilingual Education and Minority Language Affairs. 



EVALUATION PLANNING 

One of the most difficult problems affecting program 
evaluation efforts in the Education Division and in ED 
has been determining the best way to identify program 
evaluation needs.^2 The problem is largely one of 
organization. Program managers need to be consulted 
regarding any studies to be done in their respective 
program areas, and in fact the ED evaluation office has 
been consistently careful to ask for the suggestions of 
program managers. Program managers and evaluation 
managers often disagree, however, with regard to 
evaluation priorities for a given program. Program 
managers are more likely to ask for evaluation studies 
that will help them improve existing management tools or 
will enlarge their information about their program 
operations; evaluators tend to be more concerned with 
whether or not a program is effectively meeting a 
longer-range objective, such as the improvement of 
academic achievement (or college enrollment rates or 
English proficiency) for a defined group of students. 
Program managers may not place a high priority on 
evaluations of program effectiveness because they believe 
that first-order questions (e.g., "Are the intended 
children receiving the intended program service?") should 
be answered first or because they fear the consequences 
of unfavorable answers to program effectiveness 
questions. In addition to this disagreement over the 
purposes of evaluations, another organizational problem 
is that senior-level program managers often simply are 
not willing to take the time to consider evaluation 
priorities at the time that decisions must be made. 

The OE, now ED, evaluation office has addressed this 
need for program consultation by seeking formal 
suggestions for evaluations from program managers once a 
year. Through 1978 the strategy was to issue an annual 
request for project recommendations from program managers 
and then to use those recommendations as one factor in 
developing a list of projects to be undertaken in the 
following year. This list was then submitted to the HEW 
Assistant Secretary for Planning and Evaluation (ASPE) 
for final approval. The amount of scrutiny by ASPE 
varied from year to year; generally only the central 
evaluation unit's plans were subjected to critical review 
even though other units, such as NIE and NCES, also 
submitted their plans. 

In 1979 a new procedure was initiated that imposed 
greater top-level control over evaluation planning and 
was intended to make plans more responsive to concerns of 
Congress and senior HEW and ED policy makers. The main 
foci of this attention were the proposals of the OE, and 
then ED, central evaluation office, but the senior-level 
review group convened for the purpose also reviewed 
fiscal 1979 evaluation plans prepared by BEH, BOAE, and 
the Bureau of Student Financial Assistance (BSFA). The 
plans of the central evaluation office, which received by 
far the major portion of the group's time and concern, 
were criticized and modified by the group primarily with 
regard to the proposed timing of studies and their 
expected cost; in a few instances plans for impact 
studies were delayed by the group and program needs 
assessment projects were suggested to precede impact 
studies. The group's primary objective with regard to 
timing was that new evaluation studies should be 
scheduled to provide useful program data in time to make 
substantive contributions to legislative debates on 
program reauthorization. Cost considerations entered the 
decisions to reduce the scope of tasks proposed in 
certain studies and to eliminate some tasks from other 
studies. Preliminary studies of program need were 
recommended in instances in which policy questions 
existed about the national need for the type of services 
to be provided by the program under review. The new 
review procedure was also used for 1980. The resulting 
evaluation plan marked the first time that a 
comprehensive OE-wide plan had been assembled. 

An example of the new review procedure in action was 
the group's decision on the proposal of the evaluation 
office to examine the effectiveness of the developing 
institutions program, Title III of the Higher Education 
Act (P.L. 92-318, amended by P.L. 94-482). In that 
action the group decided that it was premature to 
consider the effectiveness of the program in improving 
the financial and educational viability of the 
institutions being funded. The group decided that a 
necessary first step was to identify a set of reliable 
indicators to apply to the financial status of a college 
or university in order to determine the financial 
strength or weakness of the institution under review. It 
was also determined that an "exploratory evaluation" of 
the developing institutions program should be 
conducted.^3 The purpose of the exploratory study would 
be to identify practical, usable measures of successful 
project implementation. If such measures could be 
identified, it would then be reasonable to go forward 
with a larger-scale study, which would, among other 
things, actually measure whether or not the developing 
institutions program was being fully implemented by 
institutions receiving awards under the program. 

Under ED Secretary Shirley Hufstedler, the 
organizational setting for program evaluation reflected 
the increased emphasis on linkages between evaluation and 
program improvement. The central evaluation office in ED 
reported organizationally to the Deputy Assistant 
Secretary for Evaluation and Program Management, who in 
turn reported to the Assistant Secretary for Management. 
The Program Evaluation Office was organizationally 
coequal to the Management Evaluation Office, which was 
assigned responsibility for management evaluation, 
management quality assurance, program assessment, and 
organizational development. In his statement before the 
Senate Human Resources Committee prior to confirmation, 
John Gabusi, Secretary Hufstedler's Assistant Secretary 
for Management, expressed his intent to improve the use 
and usefulness of ED evaluations for purposes of 
management improvement in ED programs, decisions on 
program budgets, and fulfilling information needs of 
Congress prior to legislative reviews. 

Gabusi's statements and the structure within which the 
program evaluation function was organizationally housed 
at that time reflect to a considerable extent the 
priorities expressed in Circular No. A-117, issued by the 
U.S. Office of Management and Budget in March 1979. 
Entitled "Management Improvement and the Use of 
Evaluation in the Executive Branch," this directive to 
federal agencies construes program evaluation as a 
component of federal management improvement. As stated 
in the circular, "agency evaluation systems . . . should 
focus on program operations and results. They should 
include procedures to assure that evaluation efforts 
result in specific management improvements that can be 
validated" (page 2). The organizational structure under 
Secretary Hufstedler reflected these priorities and may 
have indicated the direction of upcoming ED evaluation 
activity. No information is available at this writing, 
however, on the program evaluation plans of Terrel Bell, 
Hufstedler's successor as Secretary of Education. 

The evaluation of federal education programs has 
undergone considerable change in the 10 years in which it 
has been a major federal activity. These changes have 
included increases in legislative priority on evaluations 
at federal, state, and local levels. Organizationally, 
we have seen the federal evaluation function centralized 
into a single agency-wide unit and then gradually 
decentralized to some degree. The creation of the 
Education Department may be quickening the pace of change 
that characterizes this process. Given these 
circumstances, it is essential that the direction and 
character of federal education evaluations be informed by 
expert, dispassionate analysis of possible methods for 
increasing the utility of federal evaluations as a tool 
for improving education. 



NOTES 

1 A second important exception was the policy research 
activities carried out by the Education Policy 
Research Centers. At the recommendation of Evans in 
1972, those centers (three in number at that time) were 
moved from the OE evaluation office to the newly 
created Office of the Assistant Secretary for 
Education in order to support that office's activities 
in education policy development. 

2 A similarly difficult issue has been the utilization 
of evaluation findings. This issue is addressed in 
Boruch and Cordray (1980) and in the report of the 
Committee. 

3 Such studies were also undertaken in a number of other 
program areas at the instigation of Joseph Wholey, 
ASPE Deputy Assistant Secretary, who had developed the 
notion of exploring the "evaluability" of a program 
before full evaluations were done. 



APPENDIX B 

Performers of Federally Funded 
Evaluation Studies 
Laure M. Sharp 



INTRODUCTION AND DATA BASE 

The evaluation of federally funded social initiatives in 
education, as in health services, crime control, or 
housing programs, is seldom carried out by federal 
agencies. The bulk of evaluation performers are private 
research firms, academic bureaus, and state and local 
agencies, which receive federal funds to conduct 
evaluations commissioned by congressional mandate or by 
executive policy makers or to carry out evaluations on 
their own initiative with federal support. Although much 
has been written on evaluation methodology and quality, 
on one hand, and on the uses and abuses of the grant and 
contract system under which federal funds are channeled 
to outside performers, on the other, there is no single 
useful data base that provides figures on federal funds 
spent in a given fiscal year on evaluation activities, 
the portion of such funds allocated to outside 
contractors or grantees, and the identification of 
contract and grant recipients. 

Evaluations in the field of education represent a 
large share of all federally funded evaluation 
activities, probably on the order of one-fifth or 
one-fourth of those activities.^ More specific 
information exists with respect to the performers of 
evaluations funded by the federal government; even in education, 
however, information is not nearly as extensive and 
reliable as one would need for a comprehensive 
assessment. The procedure of piecing together relevant 
information from various sources is subject to a high 
degree of imprecision for several reasons: 

• There is no commonly accepted definition of 
evaluation activity. In particular, the boundaries 
between evaluation and research are far from clear-cut, 
as discussed by Reisner in Appendix A and by Abramson 
(1978) in his work on federal funding of social research 
and related activities. Evaluation performers themselves 
are even more inconsistent with respect to these 
boundaries. 

• The data that are available seldom refer to the 
identical time span. Yet the volume and nature of 
federally funded evaluation activities in education have 
varied considerably over the time period (1974-79) 
considered in this paper. 

• While evaluation studies commissioned by federal 
agencies have been increasingly funded in the form of 
contracts awarded through the competitive procurement 
process, work in the evaluation area is also awarded in 
the form of grants and "sole-source" awards. In 
addition, existing contracts and grants are often 
extended and modified, frequently with the addition of 
new funds. Information about these types of funding 
activities is difficult to locate. 

• The prevailing revenue-sharing model under which 
large funds are allocated to state and local 
jurisdictions on a discretionary basis makes it almost 
impossible to estimate the level of evaluation activities 
carried out by these jurisdictions. In particular, 
systematic documentation is lacking about the extent to 
which such activities are performed by staffs of state 
and local education agencies or under grant and contract 
arrangements by outside organizations. While there is 
some discussion in this paper of the evaluation 
activities of state and local education agencies, data 
presented for those sectors should be viewed as 
especially rough estimates. 

• While many contracts or grants may be awarded for 
the exclusive purpose of conducting an evaluation, there 
are probably many more instances where evaluation is 
merely one component of a project. This is especially 
true of social experiments and demonstration programs. 




Most of the data in this paper were obtained through a 
survey of performers of research and research-related 
activities in education, the American Registry of 
Research and Related Organizations in Education (ARROE). 
The ARROE project was conducted from 1976 to 1979 by the 
Bureau of Social Science Research under contract to the 
National Institute of Education. To create a listing of 
potential performing organizations, a variety of sources 
was used, including rosters of state departments of 
education, intermediate education agencies, local school 
systems, federal grantees and contractors, and authors of 
articles in 02 pertinent journals. The ARROE project 
initially identified more than 6,300 organizations that 
might meet the criteria for inclusion in the survey, and 
a questionnaire was mailed to each organization. 
Organizations that had been active performers during 
their last completed fiscal year and were distinct 
organizational entities were considered eligible for the 
survey and were asked to complete the entire 
questionnaire. Organizations that failed to respond were 
contacted by telephone and, if eligible, were asked a 
number of key questions. Of the 6,346 organizations on 
the original mailing list, 81 percent were contacted and 
their eligibility established. Of the 5,208 
organizations with whom contact was made, data from just 
about half (2,434) were included in the data analysis; 
most of the others were ineligible, frequently because 
they had not carried out educational RDD&E during their 
most recent fiscal year. (The derivation of the ARROE 
data base is sketched out in Table B-1.) Slightly less 
than half of the reporting units had returned the 
detailed mail questionnaires, while slightly more than 
half of the units were asked the abbreviated set of 
questions in a telephone interview. Thus, the ARROE 
survey yielded two data sets: a basic set for all 
organizations for whom some data was obtained (N = 2,434) 
and a more detailed set (N = 1,071) limited to those 
organizations that completed mail questionnaires.^ The 
2,434 performing organizations covered by the survey were 
located in 1,530 separate institutions (see Table B-2).^ 
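
The proportions described in this paragraph can be checked directly from the counts it reports (6,346 organizations mailed, 2,434 analyzed, 1,071 detailed mail responses, 1,530 institutions). The Python sketch below is only an illustration: the function and key names are hypothetical, and it assumes that every analyzed organization without a completed mail questionnaire was covered by the abbreviated telephone interview.

```python
def survey_shares(mailed, analyzed, mail_detailed, institutions):
    """Return simple proportions describing the survey data base.

    Assumption (not stated as a count in the text): analyzed units that
    did not return the mail questionnaire were telephone interviews.
    """
    telephone_only = analyzed - mail_detailed
    return {
        "analyzed_share_of_mailing_list": analyzed / mailed,
        "mail_share_of_analyzed": mail_detailed / analyzed,
        "telephone_share_of_analyzed": telephone_only / analyzed,
        "organizations_per_institution": analyzed / institutions,
    }


# Counts as reported in the paragraph above.
shares = survey_shares(mailed=6346, analyzed=2434,
                       mail_detailed=1071, institutions=1530)
for name, value in shares.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the mail respondents make up about 44 percent of the analyzed organizations, which agrees with the statement that slightly less than half returned the detailed questionnaire.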

While evaluation was one of the activity areas covered 
by the ARROE survey, it was not its primary focus. The 
ARROE staff, in consultation with an advisory committee 
on which the principal types of performers were 
represented, came to the conclusion that in fact most 
organizations that perform research and research-related 
activities would find it difficult to differentiate 
between types of functions in funding, expenditures, and 
staffing. This was believed to be the case especially 
with respect to basic versus applied research, but also 
for research versus evaluation and policy studies. 

TABLE B-2 Organizations and Institutions Active in 
Educational RDD&E, 1976-77, and Included in ARROE 

                     Number of       Number of Institutions
                     Separate        in Which These
                     Organizations   Organizations
                     Identified      Were Located             Types of Institutions

Public education       688             631                    37 State education agencies
agencies                                                      193 Intermediate service agencies
                                                              401 Local education agencies

Academic             1,268             423                    Public and private junior colleges,
                                                              4-year colleges, universities, and
                                                              their divisions; educational R&D
                                                              centers

All others             478             476                    Private nonprofit and for-profit
                                                              organizations and noninstructional
                                                              governmental agencies; independent
                                                              education R&D laboratories

The definition of evaluation studies also posed a 
problem. The ARROE staff and their advisors saw the need 
for a fairly restrictive definition, given the propensity 
of some respondents, especially those in public education 
agencies, to include under the heading of evaluation the 
compilation and reporting of periodic or routine 
statistics and information. For this reason, ARROE 
labelled the relevant category "evaluation and policy 
studies," which was defined as "systematic inquiries 
specifically addressed to policymakers and intended to 
inform their major policy decisions. Subsumed are 
assessments and effects of RDD&E-based programs, 
determination of the feasibility of new programs and 
projects, and studies focusing on needs, goals, and 
priorities of action regarding ongoing or contemplated 
activities." Thus, ARROE's definition of evaluation 
activities differs to some extent from those used by 
other investigators and especially by the Committee. 
With respect to the latter, ARROE's definition is both 
more restrictive, because it specifies policy makers as 
the audience, and broader, because it specifically 
includes policy studies. 

Using the ARROE definition, several questions about 
evaluation activities were included in the mail 
questionnaire. Respondents were asked to estimate what 
percentage of their education research, development, 
dissemination, and evaluation (RDD&E) expenditures were 
used primarily for evaluation and policy studies and how 
many full-time and part-time professionals spent the 
greatest percentage of their working hours performing 
evaluation and policy studies. "Project and program 
evaluation" was also listed as one of more than 50 
problem areas among which respondents could select those 
to which their organizational activities were primarily 
directed. 

The discussion on the following pages is based on 
these data and on related analyses of the ARROE data base 
(Frankel 1979, Frankel et al. 1979, Lehming 1979, Sharp 
1979, Sharp and Frankel 1979). I believe that this 
discussion is helpful in providing a rough picture of the 
performer universe and especially of those organizations 
that are most active in what is sometimes called the 
evaluation industry. It would be foolhardy to claim a 
high degree of precision for the numbers presented 
here, given such problems as missing data, reluctance on 
the part of some performers to respond in detail to questions 
on financial affairs and on staffing, and possible 
respondent misinterpretation or distortion. 
Nevertheless, there is enough consistency within the data 
set and enough congruence between the ARROE-based 
findings and those of other investigators to provide 
reasonable confidence about the general trends portrayed 
by the data. 



ESTIMATE OF FUNDS EXPENDED BY EVALUATION PERFORMERS 

On the basis of the ARROE data, I estimate that about 
$100 million in federal funds were spent for education 
evaluation in 1977 by extramural performers. These 
estimates are based on three calculations. First, data 
from 80 percent of the 2,434 eligible ARROE respondents 
showed aggregate total expenditures for all education 
research and research-related activities of $735 
million. Adjusting this number for the 20-percent 
nonresponse, I estimate total RDD&E expenditures by 
educational research performers in 1977 at $900 million. 
Second, data from a subset of respondents (864 
organizations that completed all relevant items on the 
detailed mail questionnaire and reported actual 
expenditures of $355 million) showed that approximately 
22 percent of all RDD&E expenditures were devoted to 
evaluation and policy studies (see Table B-3). Applying 
this proportion to the total ARROE population, I estimate 
that total expenditures for evaluation and policy studies 
in education were approximately $200 million.^ Third, 
about half of all reported RDD&E expenditures in 1977 
came from federal sources. This proportion may be a 
conservative estimate for evaluation given the 
characteristics of the principal performers (which are 
discussed below). 
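
The three calculations in this paragraph can be retraced with a few lines of arithmetic. The Python sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that the nonresponse adjustment is a simple proportional scale-up, which the text does not spell out. The inputs are the figures quoted above ($735 million reported by 80 percent of respondents, a 22 percent evaluation share, and a federal share of about one-half).

```python
def estimate_federal_evaluation_spending(reported_musd, response_rate,
                                         evaluation_share, federal_share):
    """Retrace the three-step order-of-magnitude estimate (millions of dollars).

    Assumption: nonrespondents spent, on average, as much as respondents,
    so reported spending is scaled up by 1 / response_rate.
    """
    # Step 1: adjust reported RDD&E spending for nonresponse.
    total_rdde = reported_musd / response_rate
    # Step 2: apply the share devoted to evaluation and policy studies.
    evaluation = total_rdde * evaluation_share
    # Step 3: apply the share of funds that came from federal sources.
    federal_evaluation = evaluation * federal_share
    return total_rdde, evaluation, federal_evaluation


total_rdde, evaluation, federal_evaluation = estimate_federal_evaluation_spending(
    reported_musd=735.0, response_rate=0.80,
    evaluation_share=0.22, federal_share=0.50)
print(round(total_rdde), round(evaluation), round(federal_evaluation))
```

Under these assumptions the intermediate results are roughly $919 million, $202 million, and $101 million, slightly above the rounded $900 million and $200 million quoted in the text and consistent with the estimate of about $100 million in federal funds.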

Thus, I estimate that in 1977, extramural performers 
spent at least $100 million for federally funded 
evaluation and policy studies. This figure is 
considerably higher than one would derive for 1979 using 
Reisner's data in Appendix A, and it is also much higher 
than that derived from an available inventory of 
competitive contracts awarded by the education agencies 
in HEW for fiscal 1977 (Kooi et al. 1978); see Table 
B-4. Nevertheless I am reasonably confident that the 
figure may be a valid order-of-magnitude estimate for 
1977 for several reasons: more funding was available in 
1977 than in 1979 (see Table B-4); Reisner's data do not 
include expenditures by public education agencies (SEAs 
and LEAs), which accounted for a sizable proportion of 
all funds expended; Kooi's data do not reflect grants and 
sole-source awards, nor do they include continuing work 
based on contracts and grants awarded in earlier years, 
including supplements made through contract 
modifications, while the ARROE study did include funds 
for continuations and supplements; the ARROE study also 
included performers who received funds from agencies 
other than HEW (for example, from DOL or NIH) for work 
that could be classified as education RDD&E; and 
classification differences, in particular the inclusion 
of policy studies, may have inflated the evaluation 
estimates for ARROE. 

TABLE B-3 Functional Distribution of Evaluation Expenditures by Sector 

                       $                                           Evaluation
                                                                   and Policy              Number of
Sector         Thousands     Research  Development  Dissemination     Studies   Other  Organizations  Percent

Private
  Profit          31,208          9.3  [illegible]           10.5        54.8     0.2             22      100
  Other (a)       95,277  [illegible]         22.6           22.7        20.9     3.3            131      100
  Total          126,485         25.3         23.2           19.7        29.3     2.6            153      100
Academic         147,086         41.6         24.4           16.4        11.5     6.2            474      100
Public
  Small LEAs      11,433         19.8         29.3            7.2        33.9     9.9            109      100
  Large LEAs      20,464         12.3         25.6            7.6        48.6     5.8             34      100
  ISAs            12,896         12.1         25.9           31.6        29.2     1.2             55      100
  SEAs            35,344         14.2         42.4           15.4        22.5     4.9             36      100
  Total           80,137         14.1         32.6           16.1        31.9     5.3            234      100
TOTAL            354,490(b)      29.5         25.8           17.7        22.4     4.7            864      100

(a) Includes primarily private nonprofit organizations, including independent nonprofit R&D organizations 
and public organizations (e.g., state and local agencies outside the field of education, such as 
hospitals or health agencies), as well as those organizations whose profit or nonprofit status could not 
be determined because they did not supply the information. 
(b) Includes $782,000 not identified by sector. 
SOURCE: ARROE mail questionnaire respondents only. 

TABLE B-4 Competitive Procurements in 1977 and 1979 for 
Evaluation Studies by Sector 

                                     1977(a)        1979(b)

Public agencies                           --         45,000
Academic institutions                199,000         38,238
Private (profit or nonprofit)      5,326,654      2,664,613
TOTAL                             $5,525,654     $2,747,851

(a) Data from Kooi et al. (1978). 
(b) Preliminary data from Kooi et al. (in press), made available to 
the author. 



SELECTED CHARACTERISTICS OF PERFORMING ORGANIZATIONS 

Who were the performers of evaluation work in 1977 and 
how were federal funds for evaluation distributed among 
various sectors of the performer community? For analytic 
purposes, ARROE classified the performer community into 
three major segments: the public education sector, which 
included state education agencies (SEAs), intermediate 
service agencies^ (ISAs), and local education agencies 
whose enrollment was 10,000 or more, which in turn were 
subdivided into large LEAs (with enrollments of 50,000 or 
more) and small LEAs (with enrollments of 10,000-49,000); 
the academic sector, which included public and private 
two-year and four-year colleges, universities, and their 
subdivisions, such as R&D centers, specialized 
institutes, and survey units; and a residual sector, 
which was largely composed of profit and not-for-profit 
research and development organizations and educational 
laboratories, but also included hospitals, publishers, 
foundations, associations, and noneducational agencies of 
state and local governments, such as health and manpower 
agencies. 

As shown in Table B-5, academic organizations
represent the largest single group of performers of 
educational research and related activities, followed by 
TABLE B-5 Distribution by Sector of All RDD&E Performers
and of Evaluation Performers

                             All RDD&E              Evaluation
                         $                      $
Sector          (N)      Thousands   Percent    Thousands   Percent

Private
  Profit        (22)       31,208       8.8       17,094      21.5
  Other        (131)       95,277      27.1       20,151      25.3
  Total        (153)      126,485      35.9       37,245      46.8
Academic       (474)      147,086      41.4       16,911      21.2
Public
  LEA, small   (109)       11,433       3.2        3,870       4.8
  LEA, large    (34)       20,464       5.8        9,953      12.5
  ISA           (55)       12,896       3.6        3,778       4.1
  SEA           (36)       35,344       9.8        7,873       9.9
  Total        (234)       80,137      22.4       25,474      32.0
TOTAL          (864)      354,490^a   100.0       79,645^b   100.0

^a Includes $782,000 not identified by sector.
^b Includes $15,000 not identified by sector.
SOURCE: ARROE mail respondents only.



those in the private sector. Public education agencies
accounted for less than one-fourth of all RDD&E
expenditures.^ With respect to evaluation, however,
the picture is very different. Organizations in the
private sector were in first place, followed by public
education agencies, and academic performers had the
smallest share. Furthermore, as shown in Tables B-6 and
B-7, only in two types of organizations (private
for-profit and local school systems) is there a
concentration of organizations that spent more than
$100,000 on evaluation in 1977 or devoted most of their
resources (50 percent or more) to evaluation activities.
The data clearly suggest that evaluation is a marginal
activity for most academic performers, while it plays a
major role in sustaining most for-profit organizations.
However, given the actual numbers of performers involved,
one should not conclude that most large evaluation
dollars were spent by private for-profit organizations in
1977: 5 for-profit organizations spent in excess of
$500,000 for evaluation compared with 12 not-for-profit
TABLE B-6 Level of Expenditures for Evaluation by Reporting Organizations
(percent of organizations)

                                        Type of Organization

                                     Private                       Public
Level of
Expenditure        All                                                  LEA,    LEA,
(dollars)          Organizations   Profit   Others   Academic   SEA    ISA    Small   Large

0                      30.6         17.4     29.0      40.2     10.8   23.7    11.7     5.9
$1-24,999              31.6          8.7     27.5      32.2     18.9   45.8    41.4    11.8
$25,000-99,999         21.0         21.7     15.3      18.2     27.0   20.4    37.8    20.6
$100,000-500,000       13.7         30.4     21.4       8.6     35.1    8.5     9.0    47.1
Over $500,000           3.1         21.7      6.8       0.8      8.1    1.7      --    14.7
TOTAL (percent)       100.0         99.9    100.0     100.0     99.9  100.1    99.9   100.1
Number of cases         873           23      131       478       37     59     111      34

SOURCE: ARROE mail respondents only.
TABLE B-7 Percentage of Organizations' Total Expenditures
Devoted to Evaluation

Sector              0      1-24    25-50    50+

Private
  For profit      19.2     15.4     26.9    38.5
  All other       23.7     43.9     18.0    14.4
Academic          34.9     41.4     11.6    12.0
Public
  SEA             14.0     51.2     25.6     9.3
  ISA             18.6     52.5     15.3    13.6
  LEA, small       8.8     24.6     26.3    40.4
  LEA, large       5.4     13.5     37.8    43.2

SOURCE: ARROE mail respondents only.

organizations and 4 academic organizations that spent
that amount.

There are sharp differences among organizations in the
various sectors of the performing universe. The balance
of this section examines separately some salient features
of evaluation performers in each of the three sectors.



For-Profit and Not-for-Profit Organizations
in the Private Sector^

What is sometimes referred to as the evaluation industry
is a group of organizations (some profit, some
not-for-profit, some large, others quite modest) that are
at present the most frequent performers of federally
funded evaluations in the field of education. With the
emergence and the predominance of the competitive
procurement system and the funding of evaluations under
contracts rather than grants, organizations of this type
are apparently best able to mount the prodigious proposal
writing efforts required for participation in the system
and to muster and manage the resources necessary to carry
out large-scale evaluation projects, often under severe
time constraints.

Obviously, the ARROE data collection effort, since it
was not targeted to performers of federally funded
evaluation but sought instead to capture the universe of
organizations that contributed to research, development,
and evaluation in education in 1977, failed to isolate
the full set of organizations that are of interest for an
assessment of federally funded evaluation performers.
Nevertheless, some of the findings are instructive: 211
of the 478 organizations in the residual sector (i.e.,
affiliated neither with academic institutions nor with
public education agencies) were classified as R&D
organizations and thus constitute the universe of
organizations potentially involved in the "evaluation
industry" (see Table B-8). Most of these 211
organizations spent less than $1 million on all research
and research-related activities in 1977, regardless of
source of funding. The 77 organizations that spent $1
million or more in 1977 include the federally funded
educational laboratories (a group of not-for-profit
institutions started with federal funding but now partly
dependent on grant and contract work) and a number of
not-for-profit groups primarily oriented to the field of
education or educational administration. The ARROE data
are incomplete (about one-third of the respondents did
not wish to disclose the information or have their names
associated with the information if they did disclose it),
but no more than 15 organizations were identified that
are members of the "industry" as popularly conceived
(System Development Corporation, Abt Associates, American
Institutes for Research, Educational Testing Service
(ETS), etc.). Only three such organizations are among
the 10 private-sector organizations that reported
expenditures of more than $5 million for all education
RDD&E; the other 7 organizations were educational
laboratories, not-for-profit education centers, and
hospitals, presumably engaged in research centered on the
education of medical personnel.^

More than other organizations, those in the private
sector and especially the major performers depend heavily
on federal funding for their activities. According to
the ARROE study, 62 percent of the funding for the
private sector came from federal sources compared with 48
percent for the academic sector. Academic institutions
rely to a greater extent on state and local government
funding: 19 percent of education RDD&E work in the
academic sector was funded from state and local sources,
but only 10 percent of the work in the private sector.
Large private-sector organizations and organizations that
specialize in education RDD&E in particular have few
other sources of funding: half of the organizations that
spent more than $1 million in 1975 for education RDD&E
received at least 75 percent of their funds from the
TABLE B-8 Types of Organizations in Private Sector

                                                    Organizations Spending
                              All Organizations     $1 Million or More

                                 N      Percent        N      Percent

Education RDD&E                 155       35           26       47
Other RDD&E                      56       13            9       16
Non-RDD&E                       213       48           18       33
  Health care                    50       11            3        5
  Associations,
    labor unions                 35        8            3        5
  Private schools                24        5           --       --
  Social science                 17        4           --       --
  Child care                     16        4           --        1
  All others                     71       16           12^b     22
Government agencies^a            23        5            2        4
TOTAL                           447^c    100           55^d    100

^a Includes government agencies other than public education
agencies.
^b Publishing, broadcasting: 2. Management consulting: 2.
Information services: 2. Other: 6.
^c Information not available for 31 cases.
^d Information not available for 3 cases.
SOURCE: ARROE mail and telephone respondents.

federal government, and one-fourth of them received at
least 90 percent from the federal government.

The ARROE data show that large performers
(expenditures of $1 million or more) account for the bulk
of all expenditures in education RDD&E in the private
sector: while they are 18 percent of all organizations
listed in ARROE, they accounted for 77 percent of all
reported expenditures. For the subset of organizations
for which there are more detailed data, the picture was
similar; furthermore, expenditures for evaluation are even
more heavily concentrated among major performers than are
expenditures for all RDD&E (see Table B-9). But these
performers do not fit the image of an industry whose only
activity and source of revenue is the performance of
evaluations in the field of education: federally funded
evaluation work is concentrated in large organizations
with diversified activities that encompass various
topical areas (for example, the Rand Corporation, Abt
Associates, and Applied Management Sciences) or several
different research functions or activities in education
(for example, ETS).
TABLE B-9 Distribution of Total Expenditures and
Evaluation Expenditures in Private and Academic Sectors
by Major and Minor Performers^a

                              Total RDD&E           Expenditures
                              Expenditures          for Evaluation
                              in 1977               in 1977

                              Percent   Number      Percent   Number

Private sector
  All organizations            100.0      354        100.0      153
  Major organizations           79.6       58         82.7       32
  All other organizations       20.4      296         17.3      121
Academic sector
  All organizations            100.0      943        100.0      474
  Major organizations           50.1       92         46.1       39
  Minor organizations           49.9      851         53.9      435

^a Major performers are those who spent more than $1 million for
all RDD&E activities in 1977; minor performers are all others.
NOTE: "Total RDD&E Expenditures" column is based on both mail
and telephone respondents. "Expenditures for Evaluation" column
is based on mail respondents only. All cases with missing data
were excluded.



This is not to say that one or another organization
may not have come into existence for the purpose of only
such activities, or even for the purpose of performing a
single contract with a given agency, a [illegible]
highlighted in a recent GAO report, especially with
respect to former employees (U.S. General Accounting
Office 1980). Small performers do carry out a fair
amount of educational research and research-related work,
and some may fit the image of the "beltway bandits" so
prominently mentioned in all the periodic exposes of the
research and contract world. It is also possible that
such respondents were especially unlikely to return the
ARROE questionnaire and were interviewed by telephone and
so were underrepresented in the group from whom detailed
information was obtained. However, the evidence
indicates that the bulk of evaluation work is done by a
relatively small number of well-established and fairly
large organizations. This hypothesized distribution of
activities across types of organizations is confirmed by
an (incomplete) inventory of competitive evaluation
contract awards made in 1977 and 1979 (Kooi et al. 1978,
in press). An earlier study by Biderman and Sharp (1972)
led to similar conclusions: while it identified a large
number of active organizations in the competitive
procurement process, it found that awards for the
unrestricted, open procurements most often went to very
active bidders, usually large organizations. Since 1972,
with increasing emphasis on open competitions, this trend
has no doubt accelerated.

As is shown in the next section of this paper, the
major performers of evaluations have large professional
staffs drawn from a wide range of disciplinary
backgrounds. Less is known about the smaller
organizations that perform the balance of federally
funded evaluations; their activities and staffing
patterns are largely undocumented since they have not
become part of the professional and disciplinary networks
in which the large organizations participate.

Evaluation in Academic Institutions

As was shown in Table B-2, evaluation clearly represented
a smaller share of total RDD&E activities for academic
organizations than for other performers. Furthermore,
despite the fact that academic organizations are the
largest performers of all education RDD&E, the dollar
amounts involved in evaluation work were relatively
small. It is not possible to ascertain from the ARROE
data to what extent academic evaluation expenditures were
funded with federal dollars obtained directly through a
grant or contract from one of the education agencies in
HEW or with federal dollars that had gone to a state or
local agency that in turn contracted the evaluation to a
college or university.

When social-science-based evaluation was first used to
assess social programs, academic institutions were
frequent performers of major evaluations, usually under
grant or sole-source contract arrangements. The reasons
for a gradual shift from grants to contracts and from
academic to other types of research performers have been
amply discussed in a number of publications (see, e.g.,
Williams 1972), most recently by Levitan and Wurzburg
(1979), who claim that by 1974 HEW had ruled out further
support of evaluations under grants and that sole-source
contracting became increasingly difficult. They report
that by 1979 officials estimated that less than 10
percent of HEW evaluation funds were awarded
noncompetitively. Whether the decline in federally
funded evaluation activities on the part of academic
units is due to their decision not to participate in
competitive procurements, or to lack of success when they
do so, cannot be ascertained from available data. It is
clear that they do not win many competitive awards:
Kooi's inventories of competitive procurements for 1977
and 1979 showed only one study in each of the two years
that could be unequivocally classified as an evaluation
study competitively awarded to an academic institution.

In their study of evaluation performers, Biderman and
Sharp (1972) found that only 11 percent of the 1,324
organizations identified as RFP recipients were
academically affiliated institutions, and the majority of
these had received the RFP at the agency's initiative. A
total of 225 bids were filed for 36 procurements; only 17
of them were submitted by academically affiliated
organizations; and only one award, not for an evaluation
study, went to an academic organization. These earlier
data suggested that academic organizations did not
participate very actively in the federally organized
competitive procurement system at that time, and this may
not have changed a great deal since.



Evaluation in Public Education Agencies 

Federal dollars are spent by state and local public
education agencies, primarily to perform evaluations that
are mandated in conjunction with federally funded
education activities. In addition, state or local
agencies may carry out federally funded demonstration or
research projects that have built-in evaluation
components. State or local agencies can also participate
in competitions for evaluation contracts; this is rare,
however, since there are more restricted types of
competitive procurements (for example, for various
demonstration and innovative programs) that are targeted
primarily to public education agencies and hence are
preferred by them.

As shown in Table B-7 above, evaluation occupies a
more prominent place in the activities of local education
agencies than in those of any other sector: more than 40
percent of such agencies included in the ARROE study
indicated that more than half of their research and
research-related activities were devoted to evaluation.
The resources of these education agencies are often
considerable; among the surveyed organizations that
reported spending more than $5 million in 1977, two were
LEAs: Los Angeles and Leon County, Florida. However,
many of the evaluation activities undertaken by such
agencies tend to rely heavily on student tests, so that
the boundaries between "testing" and "evaluation" are
often hard to draw. It may be for this reason, or
perhaps because LEAs do not always identify sources of
evaluation funding accurately, that LEAs appear to be
somewhat less dependent on federal funds than are other
public agencies to carry out their evaluation activities
(see Table B-10).

Evaluation, at least as defined for the ARROE
study, plays a lesser part for state agencies than it
does at the local level, but (as shown in Table B-3
above) the actual amounts involved are larger because of
the higher expenditure levels in these agencies.
Relatively few state and intermediate service agencies
spent more than 25 percent of their RDD&E resources on
evaluation.

According to the National Science Foundation (NSF)
(1980), local personnel generally tend to perform most

TABLE B-10 Percent of All Organizations Reporting That
Half or More of Their Funds Came From Federal Sources
in 1977

                  Organizations for Which     Organizations for Which
                  Evaluation Was a            Evaluation Was Not a
                  Major Activity              Major Activity

                  Number     Percent          Number     Percent

Public
  SEA               26        73.1              18        50.0
  ISA               37        35.1              23        30.4
  LEA, large        33        24.2               5        60.0
  LEA, small        84        11.9              28        35.7
Academic           241        36.9             290        [?]9.3
Private
  Major             16        68.6               8        62.5
  All other         70        52.9              74        62.2

NOTE: Organizations could check more than one "major activity"
area.
SOURCE: ARROE mail questionnaire respondents.
research and related activities in-house, although the
portion performed extramurally has increased in recent
years, from 20 percent in 1966 to close to 40 percent in
1977. Of that 40 percent, private firms performed 17
percent; not-for-profit firms, 13 percent; and
universities and colleges, about 10 percent. The extent
to which this pattern holds for education as compared
with energy, environment, health, etc. cannot be
ascertained from the NSF data. However, information from
a recent survey of school districts (Lyon 1978) indicates
that on the average only 6 percent of the budget of a
district's evaluation units was spent on outside
consultants, although there was considerable variation
from district to district. State agencies, too, appear
to perform most work in-house: one recent study reports
that 73.3 percent of all research and research-related
activities are conducted by agency staffs (Mathis and
Walling 1979).



PERSONNEL 

The organizations included in ARROE employed 
approximately 22,200 full-time and 12,000 part-time 
professionals in 1977. The distribution of personnel 
matches the distribution of funds, although in the 
aggregate, academic institutions allocate more persons 
per dollar than organizations in the other sectors (see 
Table B-11). Staff qualifications vary by sector, with
those in academic organizations most likely to hold a 



TABLE B-11 Staffing and Funding Allocation for Education
RDD&E, by Sector, 1977 (in percentages)

                   Full-Time        Part-Time
Sector             Professionals    Professionals    Funding

Private                 27               16             33
Academic                58               76             51
Public                  15                7             16
TOTAL (percent)        100              100            100
Number              22,286           12,024           $735 million^a

^a Based on reports from 80 percent of respondents.
SOURCE: ARROE mail and telephone respondents.
doctorate degree; more surprising, private-sector
organizations are more likely to employ people from a
wider spectrum of academic disciplines (see Table B-12).

As was noted above, most organizations do not
specialize in evaluation, and therefore staff is likely
to be used interchangeably between evaluation and
research. Insofar as the ARROE data allow
differentiation, however, the following characteristics
apply to those staff who actually worked on evaluation
studies in 1977. First, the percentage of total staff
allocated to all evaluation was slightly lower than the
percentage of expenditures: 22 percent of funds and 17
percent of personnel were devoted to evaluation and
policy studies. This is not unexpected since the
staff/dollar ratio for all RDD&E is highest in the
academic sector and lowest in the private sector (see
Table B-11) and the private sector is the most frequent
performer of evaluations. In the absence of data, one
can only speculate about the reasons for the difference
in staff/dollar ratio. It may be due to the greater



TABLE B-12 Selected Characteristics of Full-Time Staff,
by Sector, 1977

                                          Public   Academic   Private

Percent of full-time staff
  with doctorates                           28        67         31

Percent with major field
  of expertise in:
    Education                               65        58         41
    Psychology                               9        10         16
    Other social science                     3         9         12
    Humanities                               2         2          5
    Physical and biological sciences         1         7          2
    Mathematics, statistics                  7         2          5
    Business economics, accounting,
      public administration                  3         2          5
    Communications, library science          3         3          7
    Operation research, systems analysis     4         1          4
    Other                                    3         6          4

SOURCE: ARROE mail respondents only; response rate to this
question was 40 percent.
dependence of private performers on federal funding in
comparison with public and academic institutions that may
be able to cover overhead or some personnel costs from
regular budgets. The availability of low-cost labor
(graduate students and post-doctoral fellows) on many
campuses may also be reflected in these figures; the data
in Table B-11 suggest that academic institutions are able
to take advantage of the availability of faculty or
students for part-time employment. However, the
difference in staff/dollar ratios may also be due to the
fact that private contractors and grantees spend higher
proportions of their funds on nonpersonnel items such as
computer work, which is often available at relatively low
cost in university settings. Another factor may be high
overhead costs in the private sector due, in part, to
proposal writing or marketing costs that are especially
high in that sector.
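
The staff/dollar comparison discussed above can be
illustrated with a small calculation. The sketch below is
not part of the ARROE analysis; it simply divides each
sector's share of professionals by its share of funding,
using the percentages in Table B-11. The function names
and the idea of a share-to-share index are assumptions
introduced here for illustration only.

```python
# Percentage shares transcribed from Table B-11 (ARROE, 1977).
SHARES = {
    "Private": {"full_time": 27, "part_time": 16, "funding": 33},
    "Academic": {"full_time": 58, "part_time": 76, "funding": 51},
    "Public": {"full_time": 15, "part_time": 7, "funding": 16},
}


def staff_to_funding_index(sector, staff_key="full_time"):
    """Return a sector's staff share divided by its funding share.

    This index is a hypothetical illustration, not a measure defined
    in the report. A value above 1 means the sector holds a larger
    share of staff than of funds.
    """
    row = SHARES[sector]
    return row[staff_key] / row["funding"]


def rank_sectors(staff_key="full_time"):
    """List sectors from the highest to the lowest index."""
    return sorted(
        SHARES,
        key=lambda sector: staff_to_funding_index(sector, staff_key),
        reverse=True,
    )


if __name__ == "__main__":
    for name in rank_sectors():
        print(f"{name}: {staff_to_funding_index(name):.2f}")
```

On the full-time figures this ordering places the academic
sector first and the private sector last, which agrees with
the pattern described in the text. Because the inputs are
rounded percentage shares, the index is only a rough guide.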

Second, there are also some noteworthy differences
with respect to staff training. Table B-13, which
presents differences in the presence of doctorate holders
on the staffs of reporting organizations, uses a
different base from most of the other data shown in this
paper. Organizations were categorized according to their
answer to a question about major activity areas, one of
which was program and project evaluation. (Respondents
were free to check as many areas as applied to their
organizations, and most checked more than one.)
Respondents were then classified into evaluators and
nonevaluators based on their answers.^ Again it is
necessary to bear in mind that not all evaluation
performers are in the "evaluator" category, but only
those who indicated that evaluation was a major
activity. Although in many cases the cell sizes are
quite small, some comparisons can be made: in the
academic sector, the participation in research and
research-related activities of those who have Ph.D.s is
ubiquitous. About three-fourths of all academic units
performing this type of work employ Ph.D.s, whether they
do evaluations or not. In most other types of
organizations, there tends to be at least one person with
a Ph.D. on the staff, but the number of Ph.D.s is greater
if one of the major activities is evaluation work. The
difference is especially striking in public agencies, but
in the private sector, too, evaluation performers almost
always have at least one person with a Ph.D. on the
staff. Only in state agencies does the presence of
evaluation activities not affect staff characteristics:
TABLE B-13 Selected Characteristics of Organizations With and Without Evaluation
as a Major Activity

                              Private   Private              Small   Large
                              Profit    Other     Academic   LEAs    LEAs    ISAs    SEAs    Total

Organization for which evaluation
is a major activity
  Percent of full-time staff
  with doctorates
    0                           6.7      24.6       8.3      19.7     5.7    20.7    11.5    13.3
    1-24                       26.7      16.4       5.2       6.6    31.4    24.1    11.5    11.7
    25-49                      26.7      26.2      10.9       7.9    25.7    24.1    50.0    17.5
    50+                        40.0      32.8      75.1      65.8    37.1    31.0    26.9    57.5
  Number                          7        60       199        18       7      15      18     324

Organization for which evaluation
is not a major activity
  Percent of full-time staff
  with doctorates
    0                          14.3      31.7      11.1      44.4    28.6    60.0     5.6    19.1
    1-24                       42.9      10.0       5.0       0.0    14.3     6.7    27.8     8.1
    25-49                      42.9      21.7       9.0      16.7    57.1    33.4    66.6    13.6
    50+                         0.0      36.7      74.9      38.9     0.0     0.0     0.0    59.3
  Number                         15        61       193        76      35      29      26     435

SOURCE: ARROE mail respondents only.
there is at least one person with a Ph.D. in most
agencies, regardless of the nature of the work.
For-profit organizations are especially likely to employ
Ph.D.s if they are engaged in evaluation. It should be
noted, however, that the data in this category are from a
small number of organizations.

Equally interesting differences can be observed with
respect to staff specialization, i.e., the presence of
disciplinary specialists on an organization's staff.
Table B-14 shows that organizations for which evaluation
is a major activity tend to have more diversified
staffs. This is especially the case in the private
sector, but holds true in the other sectors as well.

Obviously, staff size, percent of staff with Ph.D.s,
and diversification of disciplines among the staff are
not in themselves a guarantee of efficient or
high-quality performance; in the aggregate, however, they
furnish some indication of the efforts expended by those
who carry out evaluation work within the educational
research community. Generally, the performers of
evaluation activities tend to be organizations with
staffs that are larger, better trained, and more
diversified than the staffs in organizations for which
other types of research and research-related activities
constitute a major activity.



Despite the difficulties of distinguishing between those 
who perform evaluations and those who perform other types 
of educational research, and between those who are funded 
from federal sources and those who are not, some 
differences among performers emerge from the ARROE data. 
Of greatest interest are differences between academic and 
private-sector organizations, since they are the true 
outsiders who perform evaluations under federal 
auspices. The public agencies are important performers 
and their activities are of crucial importance in the 
assessment and evaluation of the impact of federal 
dollars spent on education, but the mechanisms at the 
disposal of the federal government in initiating and 
monitoring evaluations in the public sector are very 
different from those that apply to contracts and grants 
awarded to academic and private organizations. 
Furthermore, public evaluation units exist and function 
to a large extent in a self-contained universe, while the 
TABLE B-14 Percent of Organizations with at Least One
Full-Time Staff Member in Selected Disciplinary Fields
(in percentages)

                                                    Math and
                          Education   Psychology   Statistics   Other

Private profit
  Major evaluation
    performer               100.0        71.4         35.7      100.0
  Other                      71.4        57.1         50.0       53.8
Private other
  Major evaluation
    performer                87.1        42.6         16.2       66.7
  Other                      65.5        29.3          5.2       66.7
Academic
  Major evaluation
    performer                78.8        37.6         16.5       52.0
  Other                      67.7        30.3         12.2       46.5
Small LEA
  Major evaluation
    performer                85.9        22.5         19.2       33.3
  Other                      88.9        11.1         11.1       25.7
Large LEA
  Major evaluation
    performer                85.3        55.9         51.5       16.7
  Other                      83.3        50.0         23.3       40.6
ISA
  Major evaluation
    performer                93.1        34.5         31.0       33.3
  Other                      73.3        13.3     [illegible]  [illegible]
SEA
  Major evaluation
    performer                96.3        22.2         33.3       56.2
  Other                      94.1        25.0         22.2       70.4
All organizations
  Major evaluation
    performer                84.5        37.2         73.8       53.2
  Other                      64.3        29.0         12.4       49.4

SOURCE: ARROE mail respondents only.



two other sectors compete, interact, and cooperate with
respect to much of the evaluation work and related 
activities. 

It is clear from the ARROE data that academic units 
continue to do the bulk of educational research in 
general and that large numbers of well-qualified persons 
are involved in such activities. Universities have at 
their beck and call resources that can be used on a
part-time basis, as the ARROE data clearly show; such
utilization is often economical and advantageous.
Therefore, it seems unfortunate that academic
institutions participate so little in one of the most
important segments of the work being done today in the
field of educational research, namely evaluation. While
private organizations can to some extent duplicate
university staffing arrangements through the use of
consultants, including academic consultants, this often
requires travel, less opportunity for day-by-day
involvement, and higher costs. Such arrangements also
cannot provide the opportunity available at universities
for faculty and graduate students to stay in close touch
with practical problems and federal concerns and for
better articulation between graduate training and
employment requirements.

But it is also worth noting that as a result of the 
shift to the private sector, a number of organizations 
have emerged that have large, sophisticated, 
multidisciplinary staffs that are very knowledgeable 
about the major educational issues of the day. Whether 
the present federal procurement system leads to the best 
possible utilization of these resources is not clear;
earlier research (Biderman and Sharp 1972) and anecdotal 
evidence suggest that the timing of requests for 
proposals, the imposition of tight deadlines coupled with 
time-consuming clearance procedures, and the need to 
devote enormous efforts to proposal preparation all 
militate against optimal utilization. In any case, the 
maintenance of this capability is far from certain, given 
the reduction in the volume of federal evaluation 
procurements in education and the ability of many of the
private-sector firms to redeploy personnel to areas such
as energy, or transportation, or defense, which may be of
higher priority than education. The loss of these
specialists will be detrimental to the field of
educational research, which has long suffered from a
narrow and parochial perspective.

As the report and other cited sources show, a 
convincing case can be made that the current procurement 
system is not designed for optimal efficiency. 
Increasingly, the choice of grants or contracts as a
means of supporting work is not based on substantive
considerations, and the eligibility criteria (based on
such categorical descriptors as profit or not-for-profit, 
minority-owned, etc.) may preclude performance by
well-qualified organizations. The contracting system is
a necessary ingredient of a government process in which a
heavy activity and service load is mandated together with
low federal personnel ceilings (Sharkansky 1980), but it
needs to be made more flexible. The data presented in
this paper suggest that most evaluation work in education
commissioned at the national level is done by performers
who have the experience and resources to perform it well,
despite occasional awards that are open to question (U.S.
General Accounting Office 1980). But the universe of
performers is a relatively narrow one. The
diversification of this universe through greater
participation by university-based research groups, the
preservation of existing proven resources in the private
sector, and improvements in the procurement system should
be of concern to those who seek to increase the quality
and utility of evaluations.

NOTES 

1 This estimate is based on Abramson's data (1978),
which showed for 1977 a total of only $63.6 million
for all federally funded evaluations. While
Abramson's definition of evaluation yields a much
lower estimate of total evaluation activities than is
generally used by other researchers, this figure can
be used to gauge the relative shares of expenditures
by various government agencies. Of the $63.6 million,
HEW accounted for more than half, with welfare
agencies accounting for the largest bloc (more than
$16 million) and education for the second largest
(close to $14 million).

2 Because of item nonresponse, especially with respect
to funding questions, the actual number of cases
available for analysis is usually somewhat smaller.

3 Especially in academic institutions, it is not
uncommon to have several separate, autonomous units
(for example a school of education, a survey research
unit, and the department of psychology) performing
education research and research-related activities.
Of the 1,268 academic organizations shown in Table
B-2, the largest number (34 percent) were individual
departments, followed by divisions or schools (24
percent) and bureaus and centers (24 percent).

4 The data files were examined for nonresponse bias and
for mail versus telephone respondent bias, as well as
for error due to missing data (item nonresponse). For
the variables available for this analysis (size of
organization, sector, etc.), there were no obvious
biases, but of course there is the always unanswered
question about characteristics of reluctant
respondents or nonrespondents that demographic
variables do not capture.

5 These data are based on the subset of mail
respondents. Total expenditure data for all ARROE
organizations showed the same ranking and order of
magnitude, but slightly different
percentages (academic 57 percent, private 33 percent,
public 16 percent), suggesting that "active" public
education agencies were more likely to return the mail
questionnaires.

6 As shown in Table this nomenclature includes a 
few government agencies other than public education 
agencies. 

7 The 10 private-sector organizations that reported
expenditures of more than $5 million (in most cases
for fiscal 1977) are Abt Associates, Inc., Education
Commission of the States, Education Development
Center, Inc., Education Finance Center, Educational
Testing Service, Far West Laboratory for Educational
Research and Development, Montefiore Hospital and
Medical Center, Northwest Regional Education
Laboratory, St. Louis Children's Hospital, and System
Development Corporation.

8 None of the contracts criticized on this basis in the
GAO report were awarded by an education agency.

9 I am indebted to Georgine Pion and Robert Boruch of 
Northwestern University for suggesting these 
tabulations and making funds available for the 
required computer work. 

10 But it should be kept in mind that ARROE encompasses a 
highly diverse set of organizations, including some 
that specialize in development and dissemination, for
which these same characteristics may not be relevant 
to work performance. 



Abramson, M.A. (1978) The Funding of Social Knowledge
Production and Application: A Survey of Federal
Agencies. Washington, D.C.: National Academy of
Sciences.
Freda M. Holley



Kaleidoscopic is a good term to describe the evaluation
of federal programs at the local and state level. There
is enormous variation both from state to state and from
district to district. Moreover, the practice of
evaluation differs across programs within those states
and within those districts.

This paper attempts to give some flavor of that
variation in such areas as evaluation funding and
budgets, personnel, evaluation activities and practices,
and, finally, dissemination and utilization. The
paper concludes with some discussion of the implications
of this variation. The reader is cautioned against a
quick assumption that such variation is undesirable. It
may well be that such variation is not helpful to those
making decisions at the federal level, but it must be
remembered that national program success can only be
built block by block at the local level. Considerable
variation may be necessary to foster program
implementation and to respond to differing needs at the
local level. Imagination may be required at the national
level to use such variation creatively to the benefit of
national purposes. It may also be necessary to recognize
that it is pointless to attempt evaluation at the
national level; one evaluation system cannot serve
local, state, and national needs alike. In any case, the



The author, a member of the Committee, is head of the
Office of Research and Evaluation of the Austin
Independent School District, Austin, Texas.
work to optimize the return from evaluation efforts at
the local and state level, both for local and national



HOW ARE EVALUATIONS FUNDED?

Our best evidence on the extent of variation in federal
program evaluation at the state education agency (SEA)
and local education agency (LEA) levels is related to
evaluation budgets. Budgets are a major concern in local
and state evaluation efforts, of course, and for this
reason most of the data collection has focused on them.
The most recent data were collected in a survey of state
and large city evaluation units on behalf of a task force
on resource allocation in program evaluation appointed by
Division H (School Evaluation and Program Development) of
the American Educational Research Association (AERA).
This survey (Drezek and Higgins 1980) reported that the
size of LEA budgets for the evaluation of Title I
programs ranged from zero to $935,000 for Title I program
budgets of $104,000 to $52 million. Similarly, the
median reported funding expressed as a percent of
program funding ranged across major programs from 7
percent for ESEA Title IVC (innovative practices and
curriculum) to 0.5 percent for P.L. 94-142 (special
education); see Table C-1 for details.

Doss (1979) surveyed large districts in the Southwest 
in order to gather descriptive information about their
Title I evaluation efforts. This survey reveals similar
variation: one program with a $3,563,071 budget had an
evaluation budget of $10,000; another program with a
budget of $2,447,020 had an evaluation budget of $88,036
(see Table C-2). The percentages reported by Doss
closely parallel those from a telephone survey reported
by Boruch and Cordray (1980). That survey, conducted as
a part of their larger appraisal of federal program
evaluation, indicated that in larger districts (defined
as those with enrollments of 25,000 and above), 1.6 
percent of Title I allocations went to evaluation. 
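
The percentages discussed here are simply the evaluation budget divided by the program budget. As a minimal sketch (not part of the Doss survey itself), the following Python snippet recomputes those shares from the dollar figures printed in Table C-2. The names `TABLE_C2` and `evaluation_share` are hypothetical, chosen only for illustration, and the districts are lettered in table order.

```python
from statistics import mean, median

# (district, Title I budget in $, Title I evaluation budget in $), as printed
# in Table C-2. District A is left out because it reported no Title I budget.
TABLE_C2 = [
    ("B", 2_660_923, 25_000),
    ("C", 4_311_745, 69_607),
    ("D", 4_188_526, 66_320),
    ("E", 12_277_805, 75_000),
    ("F", 3_374_458, 43_000),
    ("G", 9_450_000, 202_973),
    ("H", 4_500_000, 115_661),
    ("I", 3_563_071, 10_000),
    ("J", 2_975_878, 36_740),
    ("K", 2_447_020, 88_036),
    ("L", 5_485_432, 50_999),
]


def evaluation_share(program_budget, evaluation_budget):
    """Return the evaluation budget as a percentage of the program budget."""
    return 100.0 * evaluation_budget / program_budget


# Share of each district's Title I budget that goes to evaluation.
shares = {d: evaluation_share(p, e) for d, p, e in TABLE_C2}
lowest = min(shares, key=shares.get)
highest = max(shares, key=shares.get)

# Summary rows of the table, recomputed from the district figures.
mean_program = mean(p for _, p, _ in TABLE_C2)
mean_evaluation = mean(e for _, _, e in TABLE_C2)
median_program = median(p for _, p, _ in TABLE_C2)

if __name__ == "__main__":
    for district, share in shares.items():
        print(f"District {district}: {share:.1f} percent")
    print(f"Lowest: district {lowest}; highest: district {highest}")
    print(f"Mean program budget: ${mean_program:,.0f}")
    print(f"Mean evaluation budget: ${mean_evaluation:,.0f}")
```

Run on these figures, the shares range from roughly 0.3 percent to roughly 3.6 percent, which are the two cases singled out in the text above.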

Webster and Stufflebeam (1978) surveyed urban 
districts nationally to gather descriptive information
about the practice of evaluation in large school
systems. Although their data are not specific as to
federal program source, the indication of the variation
TABLE C-1 (title illegible in source)

(Column headings are not legible in the source; those shown are inferred
from the table note and the accompanying text.)

                                    Number    Program budget              Evaluation budget                    Median percent
                                    of        ($ thousands)               ($ thousands)                        budgeted for
Program                             LEAs      Low          Median  High   Low  Median       High         evaluation

Smaller LEAs: entries illegible in source.

Larger LEAs (number = 25)

ESEA Title I, for
  disadvantaged students            21        1,078        4,710   52,000  17  100          935          2.0
ESEA Title I, for
  migrant students                   4           48          290      798   3    7           41          4.5
ESEA Title IV-C,
  innovative curricula              16  [illegible]          250    2,112   0   17           66          4.0
ESEA Title VII,
  bilingual programs                13          107          390    7,372   0  [illegible]  [illegible]  3.0
ESAA Emergency School
  Aid Act programs for
  desegregating LEAs       [illegible]          350        1,410    9,400   0   37          231          3.0
P.L. 94-142, special
  education programs                12          110          510   10,254   0    2          299          0.5

NOTE: Low and High designate the lowest and highest values, respectively, reported
for each budget item by each LEA category. For each LEA having a particular federal
program, the percentage of the program budget allocated for evaluation was computed.
Entered in this table are the medians of these percentages budgeted for evaluation.
SOURCE: Drezek et al., 1980.
TABLE C-2 Title I Evaluation Budgets of 12 Southwestern
Districts

             Total Title I    Title I Evaluation
District     Budget ($)       Budget ($)            Percent

A            no response          75,000
B             2,660,923           25,000              0.9
C             4,311,745           69,607              1.6
D             4,188,526           66,320              1.6
E            12,277,805           75,000              0.6
F             3,374,458           43,000              1.3
G             9,450,000          202,973              2.2
H             4,500,000          115,661              2.6
I             3,563,071           10,000              0.3
J             2,975,878           36,740              1.2
K             2,447,020           88,036              3.6
L             5,485,432           50,999              0.9

Mean          5,021,351           71,212(a)           1.4
Median        4,188,526           66,320(a)           1.6

(a) Includes only those districts reporting both Title I and
evaluation expenditures.
SOURCE: Doss (1979).



in the amount of federal funds available for evaluation
also parallels the findings from the later studies (see
Table C-3). As Table C-3 shows, federal funds constitute
a considerable portion of most school district evaluation
resources. This is somewhat at odds with the finding in
Lyon and Doscher (1979) that the funding for the
average evaluation office is 65 percent local, 18 percent
federal, 15 percent state, and 1 percent other. This
discrepancy may be related to urban differences and to
whether flow-through monies are treated as state or
federal resources.

The ranges of funding are as great as they are 
primarily because of the way in which evaluation funding 
is secured and secondarily because of differences in 
evaluation requirements across federal programs and 
across state agencies. One way to illustrate the 
situation is to describe how funds for evaluation of
three different federal programs are typically secured
using the experience of one district as a focal point of
the description. The district is the Austin Independent
School District, Austin, Texas. Although procedures are
not exactly the same in other districts, there is
considerable similarity. 



TABLE C-3 Funds Expended on Research and Evaluation Activities Within Large Urban School
Districts

                      Local/State Funds(a)   Federal Funds(a)   Total            Cost Per Student(b)
District              ($ thousands)          ($ thousands)      ($ thousands)    ($)

New York                      300                10,000             10,300            9.49
Dallas                      1,451                 1,060              2,511           18.05
Philadelphia                1,222                 1,378        [illegible]            9.62
Chicago                       900                 1,300              2,200            4.10
Detroit                     1,203                   860              2,063            8.63
Boston                [illegible]                   650              1,591           18.37
Los Angeles                   800           [illegible]              1,590            2.59
Baltimore                       0           [illegible]              1,299     [illegible]
Atlanta                       845           [illegible]              1,099           13.26
Dade County                   402                   290                692            2.89
Austin                        356                   318                674           11.50
San Antonio                   271                   300                571            9.44
Milwaukee                     274                   274                548            5.00
Cleveland                     260                   250                510            3.98
St. Louis                     140           [illegible]        [illegible]            5.95
Portland                      411           [illegible]        [illegible]            7.64
Seattle                       350                    75                425     [illegible]
Cincinnati                    141                   253                394            5.98
Fresno                        210                   180                390            7.10
Nashville-Davidson            226                   164                390            5.00
Denver                        336                     0                336            4.50
San Jose                      275                    60                335            9.01
New Orleans                   294                     0                294            3.15
Fort Worth                    155                   121                277            3.89
Phoenix                       141                   120                261            6.50
Honolulu                      194                    67                261            5.14
Kansas City                   150                    83                233            4.15
Wichita                       102                   114                216            4.32
El Paso                        96                    98                194            8.14
Corpus Christi                151                    33                184            4.49
Omaha                          98                    67                165            3.07
Dayton                        148                     0                148            3.59
Oklahoma City                 105                    19                123            2.57
Anne Arundel          [illegible]                     0                114            1.47
Orange County                  40                    27                 67            0.80

TOTAL                      13,002                20,904             33,906

(a) These figures are self-reports. Where zeros (0) appear funds may be allocated for planning and
evaluation to departments other than the main evaluation department.

(b) Student enrollment figures were obtained from the Public Education Directory 1977-78, published by
Tomi Publications, Chicago, Illinois.

(c) This budget is probably somewhat higher since evaluation and research functions are performed by a
number of different departments.
SOURCE: Webster and Stufflebeam (1978).

ESEA Title I



Title I evaluation is the largest federal program 
activity in the Austin Independent School District (Austin
ISD), as it typically is in all SEA and LEA evaluation
units. LEA funds for evaluation are secured as a part of
an application to the SEA. The evaluation is developed 
by the Austin ISD as one component of an overall Title I 
program. The component sets out the scope of work to be 
performed, identifies the personnel to carry it out, and 
develops a budget for the activity. The amount of the 
budget for the evaluation component is initially 
established by the district on the basis of a district 
policy statement that ties evaluation funding to program 
size on a sliding-scale guideline. (This approach is not 
typical since most agencies lack such a policy 
statement.) What goes into the Title I application for 
evaluation is generally affected by the attitude of the 
LEA toward evaluation, the way in which the application
content is controlled within the LEA, the evaluation
capability of the LEA, and in turn, by all those same
factors at the SEA level. In the Austin ISD, the
development of applications is watched rather closely by
both the school board and the top district
management. Moreover, the staff of the department
handling federal program fund applications is favorable
toward research. In Austin at one time, and in many 
districts today, the application content could be almost 
entirely controlled by the application writer. When this 
is true and when the writer is not favorable toward 
evaluation, it can have considerable impact on the 
evaluation capability. 

Once developed, the application is negotiated by the 
district program officer with the SEA program officer. 
The entire application is generally under the supervision 
of one SEA consultant; the SEA evaluation unit will 
almost never be involved in the review or negotiation of 
the application. Similarly, the district evaluation 
staff will typically not be involved in the negotiation. 
The SEA program officer is very unlikely to have seen the 
district evaluation report from the previous year and may 
well have little appreciation for the cost of 
evaluation. Since the LEA program officer will likely 
negotiate with the SEA program officer, the former's 
willingness to support the evaluation budget will be 
crucial at this point. When this kind of situation 
exists, of course, the positive or negative nature of the 
last evaluation report may well influence the LEA program
officer's willingness to offer that support.

In summary, the Title I evaluation budget at the local
level may be influenced by a number of political factors,
many of which will not favor rigorous evaluation and
reporting. A better model would provide for involvement
of the SEA evaluation staff throughout the application
and approval process. Not only would evaluation
activities get less one-sided consideration, but, more
important, evaluation staff could introduce improvements
into the program plans based on the results of completed
studies.



Emergency School Aid Act (ESAA) 

ESAA programs have been another source of considerable 
evaluation funding in the past, particularly for urban 
school districts. When the initial guidelines for 
application were issued, they were in many ways model 
guidelines for the development of high-quality 
educational proposals and programs. They set up criteria 
for scoring proposals that were based on a number of 
aspects of the program including the objectives and the 
evaluation. The forms were laid out so that the 
activities and evaluation should flow from the 
objectives. It has been interesting to watch what has
happened to the actual awarding of grants in view of that
model.

The Austin ISD annually goes through an elaborate
process of proposal development that involves community
hearings, working with an advisory group, and extensive
staff involvement. The product of such extensive
political input is usually a huge, uncontrolled set of
small fragmented components, one of which is evaluation.
In the Austin ISD the resulting product usually involves
every school campus, some community outreach, and various
disciplines from counseling to remedial reading. Even
under normal resource constraints, an evaluator would
stand in awe of trying to develop accountability measures
for implementation and achievement of objectives. There
are, however, some additional resource constraints that
have at times made the task out of the question; they are
discussed below.

After the proposals are put in final form by the LEA, 
they are reviewed by SEA representatives and submitted to 
the federal level. Until 1979, proposals were submitted 
to the regional office; now the Washington ESAA office
staff handles the projects. The ESAA office customarily
brings reader panels in to review the proposals. These
readers try to apply the criteria set up in the ESAA
application process to the proposals. They are
often ESAA program officers from other LEAs and from
SEAs. Again, these readers are unlikely to have any
knowledge of evaluation. Often neither readers nor program
officers understand the sophisticated set of
criteria originally established for ESAA. For example,
the original guidelines called for awarding points on the
basis of well-developed objectives. Specific percentages
were mentioned as desirable. At least regionally this
was eventually interpreted as "the more percentages, the
better." This eventually led to such meaningless
objectives as "10% of a 10% sample of high school
students will score 75% on a measure of involvement."
Our office was told at one point that a comparison based
on a significantly higher performance of a program group
over a control group was unacceptable.
In the early 1970s, the Austin ISD did try meaningful
evaluation in ESAA programs several times. We had
budgets of as much as $84,000 for a program with a budget
of $840,000 for the ESAA bilingual component. (At one
time, Austin had three large ESAA programs: basic,
pilot, and bilingual, so that the annual ESAA program
budgets totalled almost $2 million.) More recently, as
the impact of Austin's last court order on desegregation
declined, funding declined as well, and evaluation
budgets fell more drastically than program budgets.
Thus, for the last three years, the evaluation and program
budgets for ESAA basic (the only component remaining in
Austin by last year) have been, respectively, $3,000 and
$163,970 in 1979; $12,000 and $414,255 in 1978; and
$5,400 and $488,900 in 1977. The drastic decline in the
evaluation budget from the early years to 1977 was due to
a regional, or perhaps national, interpretation of the
legislation that a set-aside of 1 percent for national
evaluation was a limit on local evaluation as well. Of
course, there is a considerable difference between what
can be done with 1 percent nationally and what can be
done with 1 percent of a small local budget. Any true
evaluation of local ESAA became impossible even when that
evaluation was merely the mandated measurement of
objectives set out by the SEA. Such objectives had to be
carefully written around what could be measured by using
existing district data, whether they had a strong
relationship to the program activities or not, since ESAA
funds cannot be used to purchase tests.
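
To make the scale of that decline concrete, the following Python sketch recomputes the evaluation share of each ESAA budget cited above and shows what a 1 percent ceiling would leave for local evaluation. It uses only the dollar figures given in the text; the names `ESAA_BASIC`, `share`, and `ceiling_dollars` are hypothetical, introduced for illustration.

```python
# (evaluation budget, program budget) in dollars, as cited in the text.
EARLY_BILINGUAL = (84_000, 840_000)  # early-1970s ESAA bilingual component
ESAA_BASIC = {
    1977: (5_400, 488_900),
    1978: (12_000, 414_255),
    1979: (3_000, 163_970),
}
SET_ASIDE_PERCENT = 1.0  # the 1 percent set-aside, read as a local ceiling


def share(evaluation_budget, program_budget):
    """Evaluation budget as a percentage of the program budget."""
    return 100.0 * evaluation_budget / program_budget


def ceiling_dollars(program_budget, percent=SET_ASIDE_PERCENT):
    """Dollars left for evaluation if the set-aside is treated as a cap."""
    return program_budget * percent / 100.0


early_share = share(*EARLY_BILINGUAL)
basic_shares = {year: share(*budgets) for year, budgets in ESAA_BASIC.items()}
cap_1979 = ceiling_dollars(ESAA_BASIC[1979][1])

if __name__ == "__main__":
    print(f"Early bilingual component: {early_share:.1f} percent")
    for year in sorted(basic_shares):
        print(f"ESAA basic {year}: {basic_shares[year]:.1f} percent")
    print(f"1 percent of the 1979 program budget: ${cap_1979:,.2f}")
```

On these figures the evaluation share drops from 10 percent in the early years to between about 1 and 3 percent, and a strict 1 percent ceiling on the 1979 program budget would amount to well under $2,000.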

However, by that time we had learned that ESAA grants
were generally going to be funded late and that,
consequently, program implementation would lag badly. We
could predict that results from the program would not be
significant. In addition, for some reason, Austin has
consistently been placed on hold by the Office of Civil
Rights for the receipt of ESAA funds, and programs do not
begin until after school begins, too late for hiring good
staff or developing good programs. ESAA seems to this
writer a model for how not to do federal programming.



ESEA Title VII Bilingual Education

A third type of evaluation experience came under ESEA 
Title VII bilingual education. For this grant the Austin 
ISD submitted a 5-year proposal directly to the Office of 
Education in the spring of 1976. It had been initially 
reviewed by the Texas Education Agency. Although it is 
customary for Title VII to require third-party 
evaluation, the Office of Education program officer 
working with Austin at that time was uniquely interested
in true research and was convinced that the
organizational placement of the Austin ISD's Office of
Research and Evaluation, reporting directly to the
superintendent and the board, did indeed make its program
independent. The officer believed that it could function
within the district and with the Office of Education as a
third-party evaluator and that it could produce work of
value to bilingual evaluation in a special way. This
5-year grant has permitted a longitudinal evaluation of
the district's bilingual education effort that has
provided distinctive information and has had a real
influence on the conduct of the bilingual program in the
school system. It constitutes one of the few
longitudinal evaluations of bilingual program students in
the country; the findings have been disseminated through
a national conference held in August 1980 with the joint
support of the National Institute of Education, the Texas
Education Agency, the Austin ISD, and a number of other
agencies.

The budgets during those years have been adequate to
permit a fairly high-quality evaluation that focused in
its early years on implementation and process evaluation
and later on the longitudinal outcomes. The first-year
(1976) evaluation budget was $88,168 with a program
budget of $845,908; the fifth-year (1980) evaluation
budget was $60,094 with a program budget of $563,000.

Summary

Funds for federal program evaluations are secured by LEAs
through applications to one of three agencies: SEAs, regional
offices of the Department of Education, or the Washington
office of the Department. The LEA application to the SEA
is typical of Title I, Title I migrant, and Title IV of
ESEA; of certain vocational programs; and of certain
special education programs. Generally, these grants are
"flow-through" monies; that is, funds are allocated to
states based on such factors as census information about
the number of low-income students in a state. In some
cases, the state in turn allocates set funds to districts
based upon similar census information; in other cases,
such as with Title IVC for innovative programming, funds
are allocated at the state level on a competitive award
basis. ESAA grants have come through the regional office
in the past and more recently through Washington. The
ESEA Title VII bilingual grant is typical of awards
secured directly from Washington. These are generally
competitive, although there is little doubt that political
factors weigh heavily in the decisions. For example, the
size and importance of bilingual populations within a
state and city seem to be important factors in decisions
on Title VII.

Methods and sources of funding are constantly changing
at every level, as indicated by the shift in ESAA funds
from the regional office to Washington. Other funds may
be shifted from Washington to the SEA. Each such change
results in changes in the procedures for securing funds.
Rare is the evaluation office in which staff remains
sufficiently aware of these changes and of new sources of
funds to be sure that all the available resources for
evaluation are tapped.

At the SEA level, funding for evaluation is typically
a portion of the funds set aside for administrative
costs. This arrangement tends to pit the evaluation unit
at the SEA level against the program administration for
resources. The SEA policy on evaluation may well be the
determining factor in how much is allocated to
evaluation. Some states, particularly large ones such as
Texas, will also have regional units or service centers.
The service centers' role in federal program evaluations
is typically not large. They may perform evaluations for
small districts on a contracted basis. In some cases
they compete with LEAs for grants, such as Title IVC, and
their evaluation activities on those grants will parallel
those of the LEAs; their evaluation reports will be
provided to the central SEA just as those by the LEAs are.

Regardless of the source of the funds, it should be
clear that the size and content of the evaluation
components of all programs are much influenced by program
officers at local, federal, and state levels. In the
Drezek and Higgins (1980) survey, only 21 percent of
state and local evaluation units reported that evaluation
costs were allocated on the basis of a fixed percentage
(see Table C-4). Therefore, it is important to note that
the control of the budget by program officials is likely
to have a real impact on the content and potential
credibility of evaluations.



WHO DOES EVALUATIONS? 

In most states certification standards are applied to
personnel in federal programs. For example, a counselor,
administrator, or supervisor must be certified to fill
those roles in Texas. In general, evaluators are not
certified and no standards are applied to the personnel
filling the role of evaluator. In some LEAs and SEAs,
the federal program director or coordinator may bear full
responsibility for evaluation, and even in agencies with
substantial evaluation units, small federal evaluations
may be done by program staff. Typically, when program
staff are given the responsibility for evaluation, they
will have neither training nor experience in evaluation
methodology, measurement, or statistical analysis. The
author has observed many small school districts in which
the person charged with Title I program evaluation is a
reading teacher brought directly from the classroom, not
only with no training in evaluation, but also with a weak
background in mathematics.

By contrast, in some states and for some programs, 
third-party or contracted evaluations are the rule. The 
qualifications of the personnel in the contracting 
agencies will generally vary as much as those of the 
staff in the LEAs. In addition, although third-party 
evaluations are supposed to ensure a lack of bias, the 
contractor sometimes has an eye on future contracts and 
TABLE C-4 Methods Used to Determine Program Evaluation
Budget in Each Type of Agency

Columns: Smaller LEAs (number = 28), percent using method;
         Larger LEAs (number = 24), percent using method.

Methods:
  A roughly fixed percentage of program costs is used.
  An amount is determined by the scope of evaluation work.
  As much as possible, since sufficient amount is seldom received.
  Other method. Examples included "all three of the above," "no fixed
  rule," need to consider salary levels of available staff.

Percentages as printed (their assignment to rows and columns is not
recoverable from the source): 25, 21, 58, 25, 21, 21.

NOTE: Some respondents indicated using more than one method.
The number of people who indicated that they used a particular
method was usually slightly larger than the number who went on
to report the actual percentage, or range of percentages, used.
SOURCE: Drezek et al. (1980).



may well be gentler in approach than internal evaluators
who are permanent staff.

Finally, in many districts and particularly in the
large urban systems, well-trained and sophisticated
evaluators with doctorates in research and evaluation
carry out evaluation tasks. Within those districts
having research and evaluation units with such staff,
evaluator competencies are reported to be at a fairly
high level in most traditional evaluation and statistical
areas. In the Webster and Stufflebeam survey (1978), for
example, competencies in areas such as multivariate
inferential statistics, measurement theory, and
experimental design were estimated by departments to be
about 3.5 on a scale of 1 to 4 where 4 is "advanced
competency." In newer methodologies such as Bayesian
analysis and econometric applications, however, the
estimates were much lower, ranging from a low of 1.54
where 1 is "no familiarity."

Despite the rather optimistic estimate of the
competencies existing in the larger evaluation units, the
author feels that even in this area there are
considerable problems both in preservice preparation of
evaluation personnel and in-service training for current
staff. These problems deserve serious consideration.



Preservice Evaluator Training

The competencies required in evaluation are many and
varied. Boruch and Cordray (1980, Ch. 4:1) point out the
misconception that any one evaluator ever could or should
have "all the skills necessary for any evaluation
effort." It is thus obvious that any evaluator training
program has to involve choices among the many types of
skills that evaluators may eventually need. The training
that most applicants have evidenced to the author falls
short of the minimum requirements needed for a public
school evaluator in three fundamental ways. The
applicants lack the degree of statistical and computer
programming skills needed; they do not have the
certification required by many public schools; and they
do not have adequate preparation for dealing with the
organizational and political context of the public
schools. Over the years the author has found that it is
possible to help bright candidates pick up the latter
skills and even to provide rather quickly a necessary
understanding of the evaluation task as opposed to the
research task, but the minimum statistical and computer
skills are an absolute entry necessity. Many of the
current "evaluation training" programs focus on
evaluation theory but fail to provide adequate training
in the fundamental skills. Even though many school
systems do make it possible to hire evaluation staff
without teacher or administrator certification, few will
permit the evaluator without those credentials to move to
administrative positions in the evaluation office. Many
evaluators do not even realize that such credentials are
needed, although in many cases it might have been
relatively easy for them to pick up such certification as
a part of their graduate programs.

There are a number of steps that might lead to better
preservice training that could be taken by the Department
of Education or Congress. For example:

• Designers of preservice training programs
receiving federal support might be required to involve
in-service evaluators;

• Federal support might be given to graduate
training programs that contain provisions for field
experience and internships in an LEA or SEA;

• Field experiences in an LEA or SEA could be
offered early in a training sequence, thus providing
exposure to requirements in those settings;

• Support might be given to interchanges between
university and SEA or LEA evaluation staff of one or two
semester lengths, so that university programs do not
become too insular.



In-Service Evaluator Training 

Since a preservice program cannot possibly give an
evaluator all the skills that will eventually be needed
and since many practicing evaluators do not presently
have even the minimum skills, better in-service training
opportunities for evaluators are desperately needed.

Many conditions limit practicing evaluators from
maintaining and increasing their skills at the present
time. Public school evaluation is an all-consuming
role. An evaluator works 12 months, with summer bringing
the heaviest work load; because resources are often
inadequate, the workday and workweek are far longer than
those of the average worker. Therefore, once an
evaluator is on the job, there simply is not sufficient
time available to renew or enhance skills. Turnover of
evaluation staff is high: the Austin ISD loses 25
percent of its evaluation staff (15 senior and 20 junior
professionals) every year. Perhaps there is such high
attrition not only because of the time demands but also
because evaluation is an emotionally difficult field.
The constant negotiations necessary have been described
in several chapters of this report, but inevitably, many
practicing educators fear and dislike evaluation and
resent the power that comes with evaluation information.
The evaluator must deal with those negative feelings on a
daily basis. At the same time, the professional rewards
for an evaluator in an LEA or SEA are few. The social
science research community tends not to esteem evaluation
work very highly, and evaluation specialists in
universities give limited recognition to work carried on
elsewhere. Thus, there is little in an evaluator's
environment that even encourages staying in the field,
not to mention participating in additional training if it
were available. In fact, however, additional in-service
training is really not even available. There are such
things as AERA presessions, and the Austin ISD staff
regularly participate in those. There are a few
week-long university sessions offered during the summer,
but summer is the busiest time of the year for an
evaluator. (The only time with any slack at all in the
Austin schedule is November, December, and January.) And
when the evaluator does participate in any of these
activities, they tend to be piecemeal and disjointed.

In the face of such a grim diagnosis, are there things
that could be done to improve in-service learning
opportunities for evaluators? Yes, but most of those
things will be very expensive, such as:

• Post-doctoral residential programs in which
evaluators return to university training for a semester
or two;

• The exchange programs between university and
LEA-SEA staff mentioned above, which would be beneficial
to the evaluator as well as the university programs;

• Special project assignments at the federal level
with built-in training by resident staff;

• Special training sessions planned and offered on
a sequential basis at times favorable to LEA and SEA
evaluation schedules;

• Visiting scholar programs such as those already
being offered on a limited basis by the Center for the
Study of Evaluation.

In addition to such formal efforts, however, much can
be done on an informal basis to encourage an evaluator's
professionalism and to provide incentives for learning.
The author has received enormous benefits in that sense
from the network membership established through Division
H of AERA and the Directors of Research and Evaluation.
The evaluation report awards given annually by Division H
were created to provide recognition for evaluation work.
The new Journal of Educational Evaluation and Policy
Analysis may provide a publication forum for evaluators.
Recently, the Title I technical assistance center for the
region serving Texas has brought together the Title I
evaluators from large cities to form a network
relationship for this region. Such networks could be of
considerable help in increasing the professionalism of
federal program evaluation staff related to such other
programs as Title VII, special education, and career
education.



WHAT HAPPENS IN EVALUATIONS? 

Compliance activities probably predominate in the
majority of federal program evaluations at both the SEA
and LEA levels. In many SEAs this may be almost the sole
preoccupation. They will design annual report documents
to gather information from LEAs, gather such information,
and provide it in turn to federal offices. They are
likely also to conduct or participate in monitoring
visits to LEAs to check fiscal and program plan
compliance. Only a few states currently attempt more
substantive studies designed to influence state plans for
the use of federal program funds or to evaluate the
effectiveness of program activities, although the
activities in several states are noteworthy.

At the local level, the first priority activities for
the evaluation unit also may well be data collection
relative to compliance. For example, one of the largest
aspects of Title I evaluation may be the collection of
data on low-income enrollments by campus, the
identification of students eligible for service based on
low achievement, and the location of students who are in
nonpublic schools or who have dropped out. Until the
advent of the Title I models, much of the reporting
involved little if any analysis. Similar activities and
numbers are fundamental in most federal program
evaluation efforts.

After these compliance or record-keeping types of
activities, the measurement of performance relative to
set objectives is probably the next most typical
evaluation activity. Great variety exists across
programs in the type of objectives established. I have
already touched on those used in ESAA programs; other
types may range from achievement outcome objectives to
service objectives based on the number of participants
served. The survey of Title I programs in the Southwest
mentioned earlier (Doss 1979) yielded information that
demonstrates both the nature of Title I objectives in
reading and a feel for the variety of test instruments
used. (Some representative samples are shown in Table
C-5.) Boruch and Cordray (1980, Ch. 5:11-12) have
appropriately criticized such objectives as arbitrary and
insufficient as standards for evaluation. After far too
TABLE C-5 Reading Achievement Objectives in Southwestern Title I Programs

Column headings: District; Grade(s); Testing Pattern; On Level; Expected Gain

Grade(s): 2-9. Testing pattern: Spring to Spring. On level: Yes. Test: MAT.
  DISTAR Reading I Program: 65 percent will show a gain of
  0.6 mo./mo. of instruction.
  DISTAR Reading II Program: 60 percent will score at the
  2.9 reading level.
  DISTAR Reading III Program, High Intensity Program, and
  Reading Skills Program: 60 percent will show a gain of
  1 mo./mo. of instruction.

Testing pattern: Spring. On level: Yes. Test: CAT.
  55 percent will show a gain of 0.1-0.6 mo./mo. of instruction.
  30 percent will show a gain of 0.7-1.0 mo./mo. of instruction.
  15 percent will show a gain of 1.1 mo./mo. of instruction or more.

Grade(s): 6-7. Testing pattern: Spring. On level: No.
Test: Local Criterion-Referenced Test.
  60 percent will attain 50 percent of grade level reading objectives.
  30 percent will attain 51-60 percent of grade level reading objectives.
  10 percent will attain 61 percent or more of grade level reading
  objectives.

Testing pattern: Fall to Spring. On level: Yes. Test: CAT.
  For 80 percent of the Title I participants it is expected that the
  mean posttest stanine will be greater than the mean pretest stanine.
  For the remaining 20 percent, the posttest stanine will remain the
  same as the pretest.

Grade(s): 1-6. Testing pattern: Fall to Spring. On level: Yes. Test: MAT.
  75 percent will gain at least three percentile points.

Grade(s): 7-8. Testing pattern: Fall to Spring. On level: Yes. Test: MAT.
  70 percent will gain at least three percentile points.

Grade(s): K-8. Testing pattern: Fall to Spring. On level: Yes. Test: MAT.
  An NCE gain that exceeds zero.

Testing pattern: Sept., Nov., Jan., Mar., May. On level: Yes. Test: G-M.
  An NCE gain that exceeds zero.

Grade(s): 1. Testing pattern: Spring Only. On level: Yes.
Test: CST or BRST.
  Will make progress in reading readiness.

Grade(s): [illegible]. Testing pattern: Fall to Spring. On level: No.
Test: CAT.
  Will show an NCE gain from pretest to posttest on a composite
  reading score.

SOURCE: Abridged from Doss (1979).
much experience in dealing with objectives in evaluation,
I have concluded that they may be a great tool for
planning, but they are a poor tool for evaluation.
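
To make the form of the Table C-5 objectives concrete, the following is a minimal sketch of how an objective of the type "65 percent will show a gain of 0.6 mo./mo. of instruction" might be checked. It is an illustration only, not a procedure taken from any district in the survey. It assumes that scores are grade equivalents in which one grade spans 10 months; the names Student, gain_per_month, and objective_met, and the sample scores, are hypothetical.

```python
from dataclasses import dataclass

# Assumption: grade-equivalent (GE) scores treat one grade as 10 months.
MONTHS_PER_GRADE = 10


@dataclass
class Student:
    pre_ge: float                 # pretest grade-equivalent score, e.g. 2.0
    post_ge: float                # posttest grade-equivalent score
    months_of_instruction: float  # months between pretest and posttest


def gain_per_month(s: Student) -> float:
    """Months of grade-equivalent gain per month of instruction."""
    return (s.post_ge - s.pre_ge) * MONTHS_PER_GRADE / s.months_of_instruction


def objective_met(students, min_gain, required_percent):
    """Return (met, percent) for an objective of the form
    'required_percent of students will gain at least min_gain mo./mo.'"""
    # Small tolerance so floating-point rounding does not exclude
    # a student who sits exactly on the threshold.
    reached = sum(1 for s in students if gain_per_month(s) >= min_gain - 1e-9)
    percent = 100.0 * reached / len(students)
    return percent >= required_percent, percent


# Hypothetical scores for four students tested nine months apart.
sample = [
    Student(2.0, 2.6, 9),
    Student(2.0, 2.3, 9),
    Student(1.5, 2.4, 9),
    Student(3.0, 3.7, 9),
]
met, percent = objective_met(sample, min_gain=0.6, required_percent=65)
print(met, percent)
```

A check of this kind reports only whether a preset threshold was reached; it says nothing about why the gains occurred, which is one reason such objectives serve planning better than evaluation.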

Only in a few instances are substantive, long-range,
or cumulative effects of federal programs examined. As 
we in Austin ISD have struggled with federal program 
evaluation over the years, we have become convinced that 
such evaluation produces the best information and leads 
to the best utilization. 

An interesting trend in the last few years has been
toward what have been called "interpretive analyses,"
such as: Impact of Title I: A Decade of Progress (Moore
and Turner 1976); Limitations of a Standard Perspective
on Program Evaluation: The Example of Ten Years of
Teacher Corps Evaluation (Fox 1977); Evaluation in the
Seventies: What We Have Learned About Program
Development and Evaluation (Holley 1977). These reports
try to bring together information gained from discrete
evaluation efforts either across years or across programs.



HOW ARE EVALUATIONS REPORTED? 

Evaluations are reported in a number of ways, both formal
and informal. There is probably less uniformity from
district to district in reporting than in either
budgeting or in activities. Again, it may be
illustrative to use the Austin ISD procedures as the
center of this discussion of reporting. ESEA Title I
involves the most elaborate reporting and is therefore
used as the example. The flow of information is charted
in Figure C-1.

The school year in Austin runs from July 1 each year
to June 30 the following year. Austin's major reports
come at the end of the year, and the month of June is a
hectic, full month of analysis, interpretation, and
report writing. As for all Austin ISD evaluation
projects, the Title I evaluation staff prepare a final
technical report and a 15-page final report. The
technical report consists of appendices covering each
data collection effort. It is long and voluminous; only
a few copies are produced. The 15-page report goes into
a book called Findings Volume. The short report is the
major communication vehicle about the project. It covers
the essential results first, then describes the project
and the evaluation and provides some discussion of the
results. This short report evolved from our growing
FIGURE C-1 Evaluation reporting for ESEA Title I in
Austin, Texas.
experience that reports longer than 15 pages never got
read at all. In addition, Title I staff must complete an
AIR report — an annual information report — to the SEA.
This is a form containing numbers, analysis of the
achievements of various components, and a space to
indicate changes to be made as a result of the evaluation
data. The Texas Education Agency has put considerable
effort into improving this reporting form over the years
in an attempt to encourage good evaluation and 
utilization. 

The AIR report is signed by the superintendent and 
submitted to the Texas Education Agency. It is not 
reviewed by the school board primarily because the board 
will receive the Findings Volume, which contains the same
results but in the usual district format. The format is 
of concern because, given the limited time available for 
the presentation and discussion of evaluation results, it 
is important not to have to expend time or effort to 
explain differing formats. Soon after June 30, which is 
the annual deadline for the completion of final
reporting, a session with the school board to review all 
results is held. Thereafter, all reports become public 
information and freely available. Copies of both the 
technical and final reports are placed in the board
office, the district's professional library, and the
Office of Research and Evaluation. Presentations of the 
results are then arranged early in the school year for 
principals, instructional staff, and various other
groups. All of these formal presentations, however, are
not nearly so important as the informal discussions that 
subsequently occur. Knowledge of important findings 
relevant to a specific instructional supervisor or 
administrator may be shared over coffee or lunch. In
particular, findings may be reviewed during planning 
sessions for particular programs or activities. 

A follow-up reporting activity for the past few years
has been a short brochure summarizing Title I results for
teachers and parents. Results are mentioned in
newsletters.

Another critical reporting period comes
during the early part of the calendar year. It is the
needs assessment for the preparation of the next year's
program plan. This assessment reports data about where
students will be and what achievement levels are. From
this report, Title I schools for the following year will
be designated and cut-offs for eligibility will be
established. The report is mainly for in-district use,
but an abstract is provided to the board and the volume
itself placed in the board office. Then it becomes
public information and is available to the community. It
is often used by other agencies in the city in their
preparation of proposals for funds.

Thus, all reports prepared about Title I are available
for public scrutiny. I do not know whether this is
common practice around the country. Although certainly
in Texas all submitted reports are public documents and
thus available to all, many districts do not make the
availability of reports well known. Also, reports are
not always submitted to school boards. This may either
be because the superintendent wishes to keep the reports 
internal or because the board is not interested in them. 



WHAT IS THE IMPACT AND USE OF EVALUATION? 

Given the picture described above, it would hardly be 
surprising if the impact and use of evaluation at the 
state, regional, or local level were difficult to trace 
or document even if we had good procedures for doing so. 
Much of the current literature on utilization seems to 
conclude that utilization does occur, but that it takes 
diverse and difficult-to-trace routes. This writer's
subjective observations concur with that conclusion. As 
a program officer from another Texas district told a 
group recently, prior to the advent of federal programs 
you could walk into a school and ask how well the 
students were performing and never get anything but 
subjective answers. Now schools all over the state know 
precise levels at which students, schools, and districts 
are performing. Sometimes they can even tell you why the 
levels are what they are. Because federal programs are 
now so pervasive, we often fail to recognize just how 
great their impact on the conduct of schooling has been.
It has been clearly demonstrated in Texas that where
evaluation produces useful results, they do get used in
program design. Eventually.

This is not to say that impact and utilization are 
what one would wish. It is of major concern to this 
writer that the effects of evaluation are only a fraction 
of what they might have been if the resources that have 
been available had been more carefully guided and 
targeted. However, evaluation has been an innovation and
we are only now learning many of the things we needed to
know about its implementation.






The fundamental lack of evaluation information that
could contribute to the overall design of better programs
is one of the most serious handicaps to extensive use.
It has been a particular idea of this writer that on
programs such as Title I or Title VII, for which we are
expending rather large sums in local evaluations, we
might find better ways to capitalize on that evaluation
effort. If evaluations of compensatory programs were
coordinated in even a minimal way, how much richer our
evaluations might have been. For example, teachers'
aides and other instructional aides are commonly used in
various compensatory programs, yet their effectiveness
has been examined only in an incidental way in a few
evaluations. What many of us have found in those
examinations has, however, been disturbing. The data are
not complete enough for conclusive statements about the
effectiveness of aides; they might have been if a larger
number of school districts had examined how aides were
being used and what the effects were. The use of time is
another important factor affecting outcomes that some
of us have stumbled on in our evaluations. Again, data
across a large number of districts collected through
careful observation studies would be far better than
estimated numbers on every child in Title I filled in
capriciously from district to district. What are some of
the ways such an idea might be accomplished? A number of
ways can be imagined, varying from fairly indirect to
direct and controlled.

In Texas, for example, a number of urban districts 
have regular meetings of their superintendents, 
curriculum staff, and evaluators. These meetings have 
led to the sharing of information among each group. The 
meetings of the evaluation group, the Joint Urban 
Evaluation Council, have resulted in similar studies on
several topics in the different cities. Measures and 
reports have been exchanged. Support for the national 
directors of research and evaluation (DRE) group, which 
now meets annually for one day prior to the AERA meeting, 
to have more frequent meetings might have similar results 
at the national level. Such a forum could be used for 
the Department of Education to present a set of critical 
issues in compensatory education and possible alternative 
evaluation designs to address these.

The Title I technical assistance centers (TACs) might
also be given the task of the informal encouragement of
such efforts as they work with school districts. In
informal discussions with one TAC center evaluator, I
discovered that such encouragement might already be
happening. Another role that the TACs could play that
would contribute in the same sense as the regular DRE
meetings would be that of bringing the Title I evaluators
together on a regional basis. Although mentioned already
as a route to improved in-service training for
evaluators, it could also be a stimulus to shared designs.

The fundamental lack of Important evaluation 
information that could contribute to improved programs 
and failure to coordinate information that does exist are 
not the only handicaps to utilization, however. There 
are other factors. First, federal programs in general 
tend not to be of high concern to most local school 
boards and administrators. This can be interpreted more 
as a matter of time available and priority than as a lack 
of interest (Holley 1980). The federal funds in the 
Austin ISD, for example, are currently about 35 million, 
but this is only a fraction of the total district 
operating budget of well over $100 million. While this
ratio is smaller than for many districts, it is still
fairly representative. Austin has had far better 
attention to federal programs and their evaluation since 
the Board of Trustees adopted as one of its top 
priorities to improve the achievement of low 
socioeconomic and minority students. The board adopted 
this priority based on evidence of the enormous deficit 
in the achievement of those students relative to the 
total student body and because they represent a growing 
proportion of the student body. With this general 
priority for these students in the district, federal and 
state compensatory programs come into focus as one of the 
major resources for achieving district priorities. The 
Department of Education may find that strong federal
program evaluation coincides with strong district
evaluation.

Another obstacle to the use of federal program 
evaluation information is the lack of recognition of 
dissemination needs. Typically, an evaluation is 
coterminous with a program grant. For example, when the
Austin ISD recently applied for a 2-month extension of 
its 5-year study of the Title VII bilingual program in 
order to provide for more extensive dissemination, the 
request was denied despite the fact that no new monies
were requested. Had our office not felt the evaluation 
results were so important that we devoted nonfederal 
resources to dissemination efforts and continue to do so,
much of the value of an important evaluation study would 
have been lost. Such constraints mean in many cases that 
no dissemination of findings ever occurs. 
Still another barrier to dissemination lies in the
area of communication. Anyone who has worked
consistently in evaluation realizes that the time
available for communication of evaluation results is
never adequate. In a large district with many competing
communication needs and with many evaluations, this is a
severe problem. Efficient evaluation units develop
communication strategies that permit the telescoping of
information through shorthand forms for reporting. Since
the data that will have impact at one level of the system
are not the same as those that will have impact at
another, the information has to be transmuted innumerable
times before dissemination is accomplished. Resource
needs for this effort may well not be recognized. Thus,
the improvement of utilization must come both through
better evaluations that produce more useful information
and through better dissemination and promotion techniques
on the part of the evaluation staff. Both efforts need
better recognition and better support from Washington.



CONCLUSION 

Variation is the theme around which this paper is
written, and surely that theme has been demonstrated.
Complexity of relationships may have emerged as a major
subtheme, however. Figure C-2 lays out some of the
funding, reporting, and advisory relationships as they
appear from the experience of the author. Each year the
complexity seems to increase with a concurrent decrease
in the flexibility available to the LEA.

Every increase in complexity has tended to bring
additional reporting demands to the LEA. Ultimately, the
bulk of that reporting burden falls on students,
teachers, and principals. To the extent that such
reporting has moved beyond their central concerns, it
becomes meaningless bureaucracy. This in turn has two
serious side effects. There will be an increased dislike
and disrespect for "evaluation," and there will be a
decreased willingness to hear and utilize evaluation
results.

Both Congress and the Department of Education would be 
wise to consider such effects in designing national
evaluation requirements and systems. Ultimately, the 
most successful evaluation of federal programs will be 
that which leads to programs that are winners — winners 
for both students and staff. 
FIGURE C-2 Relationships in LEA evaluation of federal
programs.

Legend: Funding Relationships; Reporting Relationships;
Advisory Relationships

SEA = State Education Agency
LEA = Local Education Agency
PAC = Parent Advisory Committee
TAC = Technical Assistance Center
CCSO/CEIS = Chief State School Officers/
Committee on Evaluation and
Information Systems
Individuals Interviewed and External
Participants in Committee Meetings*



JOHN ANTHONY, Office of Administration, Management, and
  Budget, National Institute of Education
KEITH BAKER, Office for Planning and Management, U.S.
  Department of Education
L. VAUGHN BLANKENSHIP, Director, Division of Applied
  Research, National Science Foundation
LOIS-ELLIN DATTA, Associate Director for Teaching and
  Learning, National Institute of Education
JANE L. DAVID, President, Bay Area Research Group, Palo
  Alto, California
PRISCILLA (PAT) E. DEVER, Administrative Officer, Program
  Evaluation, U.S. Department of Education
JOHN W. EVANS, former Assistant Commissioner for the
  Office of Evaluation and Dissemination, Office of
  Education, U.S. Department of Health, Education, and
  Welfare
JOHN GABUSI, Assistant Secretary for Management, U.S.
  Department of Education
EDWARD B. GLASSMAN, Office for Evaluation and Management,
  U.S. Department of Education
WILLIAM A. HIGHTOWER, Human Resources Division, U.S.
  General Accounting Office
HOWARD F. HJELM, Director, Division of Research and
  Demonstration, Office of Vocational and Adult
  Education, U.S. Department of Education
BOBBY R. HOOVER, Human Resources Division, U.S. General
  Accounting Office
SAMUEL W. HUNT, Staff, Appropriations Committee, U.S.
  Senate

*Affiliations of individuals at time of interviews
JOHN JONAS, Legislative Assistant to former
  Representative Elizabeth Holtzman (author of provision
  in P.L. 95-561 to assess program evaluation)
ROLF LEHMING, Dissemination and Improvement of
  Practice, National Institute of Education
RICHARD T. LOUTTIT, Director, Division of Behavioral and
  Neural Sciences, National Science Foundation
ROBERT J. MARONEY, Office of Evaluation and Management,
  Program Evaluation, U.S. Department of Education
JOHN M. MAYS, Office of Director, National Institute of
  Education
LINDA MORRA, Office for Special Education and
  Rehabilitative Services, U.S. Department of Education
ELIZABETH R. REISNER, National Testing Service Research
  Corporation, former staff, Office of Education, U.S.
  Department of Health, Education, and Welfare
ALFRED R. SCHNUPP, Human Resources Division, U.S. General
  Accounting Office
DOROTHY A. SHULER, Office of Evaluation and Management,
  Program Evaluation, U.S. Department of Education
JOHN SEAL, Deputy Assistant Secretary for Evaluation and
  Management, Office of Management, U.S. Department of
  Education
MARSHALL (MIKE) S. SMITH, former Assistant Commissioner
  for Policy Studies, Office of Education, U.S.
  Department of Health, Education, and Welfare
CARL E. WISLER, Office of Evaluation and Management,
  Program Evaluation, U.S. Department of Education
JOSEPH S. WHOLEY, former Deputy Assistant Secretary for
  Evaluation, U.S. Department of Health, Education, and
  Welfare
ROSEMARY C. WILSON, Director, Division of Follow-Through,
  Office of Elementary and Secondary Education, U.S.
  Department of Education
THOMAS R. WOLANIN, Staff Director, Subcommittee on
  Post-Secondary Education, Committee on Education and
  Labor, U.S. House of Representatives